Tencent-Hunyuan / HunyuanVision
☆49Updated this week
Alternatives and similar repositories for HunyuanVision
Users who are interested in HunyuanVision are comparing it to the repositories listed below.
- [NeurIPS 2024] Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner"☆70Updated last year
- ☆78Updated 5 months ago
- Official PyTorch implementation of TokenSet.☆125Updated 6 months ago
- AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model☆44Updated 3 months ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.☆34Updated last year
- Towards training VQ-VAE models robustly!☆84Updated 2 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation☆87Updated 7 months ago
- LVAS-Agent Code Base☆21Updated 5 months ago
- Automatic Video Generation from Scientific Papers☆104Updated this week
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation☆67Updated 3 weeks ago
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca…☆51Updated 2 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆49Updated 8 months ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifies image understanding and generation.☆37Updated last year
- [ICLR 2025] Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching☆52Updated 5 months ago
- [CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network☆106Updated 2 months ago
- An official implementation of SwapAnyone.☆70Updated 6 months ago
- The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation☆35Updated 5 months ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective☆71Updated 11 months ago
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models☆39Updated 7 months ago
- ☆129Updated 3 months ago
- VideoAuteur: Towards Long Narrative Video Generation☆43Updated 9 months ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign)☆80Updated 5 months ago
- [Preprint] UCGM: Unified Continuous Generative Models☆168Updated 4 months ago
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora☆40Updated last year
- Image Tokenizer Needs Post-Training☆20Updated last week
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated last year
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆48Updated 2 months ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆150Updated 2 weeks ago
- TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation☆33Updated 10 months ago
- ☆70Updated 10 months ago