Tencent-Hunyuan/HunyuanVision

☆94

Alternatives and similar repositories for HunyuanVision

Users who are interested in HunyuanVision are comparing it to the repositories listed below.

Tencent-Hunyuan / Hunyuan-1.8B
☆43 · Aug 5, 2025 · Updated 11 months ago
baaivision / Emu3.5
Native Multimodal Models are World Learners
☆1,538 · Dec 30, 2025 · Updated 6 months ago
Robertwyq / Object-Affinity
[TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation
☆14 · Sep 14, 2023 · Updated 2 years ago
MCG-NKU / ExperiCV
Initial code for computer vision experiments
☆11 · Jan 1, 2023 · Updated 3 years ago
shiml20 / FlowTurbo
[TPAMI 26/ NeurIPS 24] Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner…
☆75 · Oct 21, 2025 · Updated 9 months ago
Tencent-Hunyuan / HunyuanImage-3.0
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
☆3,203 · Jun 23, 2026 · Updated last month
EsmaeilNarimissa / aws-sft-grpo-budget-llm-finetune
☆19 · May 17, 2025 · Updated last year
wangjiangshan0725 / Elastic-DiT
[ICML 2026] Elastic Diffusion Transformer: Accelerating SOTA generation models (e.g., Qwen-Image, Hunyuan3d) through adaptive computatio…
☆49 · May 1, 2026 · Updated 2 months ago
Tencent-Hunyuan / HunyuanImage-2.1
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
☆673 · Oct 14, 2025 · Updated 9 months ago
MoonshotAI / Kimi-VL
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
☆1,214 · Jul 15, 2025 · Updated last year
LINs-lab / UCGM
[Preprint] UCGM: Unified Continuous Generative Models
☆185 · May 27, 2025 · Updated last year
heshitian / PLUG
Learning Remote Sensing Object Detection with Single Point Supervision
☆18 · Dec 18, 2023 · Updated 2 years ago
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆196 · Mar 17, 2025 · Updated last year
dongyh20 / Insight-V
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240 · Nov 7, 2025 · Updated 8 months ago
liuzuyan / ElasticCache
[ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache
☆43 · Jul 26, 2024 · Updated 2 years ago
aim-uofa / SegPrompt
Official Implementation of ICCV 2023 Paper - SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning
☆112 · May 28, 2025 · Updated last year
donglixp / ICL_PaperList
Paper List for In-context Learning 🌷
☆19 · Jan 3, 2023 · Updated 3 years ago
Vchitect / RealDPO
☆32 · Dec 17, 2025 · Updated 7 months ago
LeapLabTHU / CODA
CODA: Repurposing Continuous VAEs for Discrete Tokenization
☆37 · Jul 4, 2025 · Updated last year
henanjun / SccovNet
SccovNet for remote sensing scene image classification, which was accepted by TNNLS
☆13 · Jun 24, 2019 · Updated 7 years ago
inclusionAI / Ming-UniVision
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
☆143 · Oct 14, 2025 · Updated 9 months ago
Beckschen / ViTamin
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
☆211 · Jun 9, 2024 · Updated 2 years ago
Ackesnal / RePaViT
This is the official code for paper [RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward N…
☆18 · Jun 20, 2025 · Updated last year
X-Omni-Team / X-Omni
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
☆426 · Aug 26, 2025 · Updated 11 months ago
ShoufaChen / gradio-box
☆20 · Jul 11, 2023 · Updated 3 years ago
MKJia / MGVQ
[Arxiv'25] MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization
☆55 · Sep 16, 2025 · Updated 10 months ago
Tencent-Hunyuan / Tencent-Hunyuan-7B-0124
☆29 · Aug 21, 2025 · Updated 11 months ago
meituan-longcat / LongCat-Flash-Omni
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
☆501 · May 9, 2026 · Updated 2 months ago
Junjue-Wang / CapFormer
[IGARSS 2022] CapFormer: Pure transformer for remote sensing image caption
☆21 · Oct 6, 2022 · Updated 3 years ago
hitcslj / Mesh-RFT
[NeurIPS 2025 Spotlight] Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
☆40 · May 20, 2025 · Updated last year
ByteDance-Seed / Seed1.5-VL
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…
☆1,583 · Jun 14, 2025 · Updated last year
WePOINTS / WePOINTS
☆190 · Mar 13, 2026 · Updated 4 months ago
Tencent / Light-SQ
☆17 · Jul 8, 2026 · Updated 3 weeks ago
cwitkowitz / ss-mpe
Code for the paper "Toward Fully Self-Supervised Multi-Pitch Estimation".
☆25 · Sep 27, 2025 · Updated 10 months ago
zai-org / GLM-V
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
☆2,361 · Jul 21, 2026 · Updated last week
BytedanceDouyinContent / SAIL-VL2
The SAIL-VL2 series model developed by the BytedanceDouyinContent Group
☆79 · Sep 18, 2025 · Updated 10 months ago
ByteDance-Seed / Seed-1.8
☆219 · Dec 19, 2025 · Updated 7 months ago
pozalabs / MID-FiLD
[AAAI'24] Official dataset & demo code for MID-FiLD: MIDI Dataset for Fine-Level Dynamics
☆21 · Mar 31, 2024 · Updated 2 years ago
huutuongtu / Lightvoc
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18 · May 17, 2024 · Updated 2 years ago