AILab-CVC / SEED-X
Multimodal Models in Real World
☆403 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for SEED-X
- Official repository for the paper PLLaVA ☆593 Updated 3 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆500 Updated 3 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆371 Updated 2 months ago
- ☆349 Updated last month
- ☆166 Updated 4 months ago
- ☆173 Updated 3 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆250 Updated 3 weeks ago
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024 ☆261 Updated 7 months ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation ☆580 Updated 2 weeks ago
- ☆278 Updated 2 weeks ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆579 Updated 2 months ago
- [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers ☆526 Updated 3 weeks ago
- 🔥🔥First-ever hour scale video understanding models ☆166 Updated 3 weeks ago
- Long Context Transfer from Language to Vision ☆334 Updated 3 weeks ago
- This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams" ☆130 Updated 3 months ago
- MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer ☆197 Updated 7 months ago
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆532 Updated last month
- ☆145 Updated 2 months ago
- ☆217 Updated 7 months ago
- 🔥 CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models ☆195 Updated 4 months ago
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training) ☆127 Updated 5 months ago
- ☆254 Updated 3 months ago
- I'm back! Implementations of Meissonic developed by Community~If you feel it is helpful, plz consider giving a star❤️ ☆249 Updated last week
- SCEPTER is an open-source framework used for training, fine-tuning, and inference with generative models. ☆428 Updated 2 weeks ago
- [NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation ☆206 Updated 2 weeks ago
- [ICLR 2024] Code for FreeNoise based on VideoCrafter ☆386 Updated 4 months ago
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation