HITsz-TMG / UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
☆796Updated last week
Alternatives and similar repositories for UMOE-Scaling-Unified-Multimodal-LLMs:
Users who are interested in UMOE-Scaling-Unified-Multimodal-LLMs are comparing it to the repositories listed below
- Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.☆1,987Updated this week
- An MBTI Exploration of Large Language Models☆447Updated 11 months ago
- 【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models☆1,683Updated 2 weeks ago
- [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation☆927Updated 2 months ago
- [ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"☆462Updated 8 months ago
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions☆1,012Updated 3 months ago
- ☆1,365Updated 3 months ago
- SDG is a specialized framework designed to generate high-quality structured tabular data.☆2,280Updated this week
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models☆127Updated this week
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models☆90Updated 6 months ago
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions☆250Updated 9 months ago
- Your Automatic Prompt Engineering Assistant for GenAI Applications☆2,082Updated 8 months ago
- OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]☆1,199Updated last month
- Real-time and accurate open-vocabulary end-to-end object detection☆1,135Updated last month
- ☆157Updated 3 months ago
- A tutorial based on MetaGPT to quickly help you understand the concept of agent and multi-agent and get started with coding development. 基…☆1,092Updated 8 months ago
- The official repository for paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"☆932Updated last week
- csghub-server is the backend server for CSGHub which helps users manage datasets, models, and also run Model Inference, Finetune and App…☆549Updated this week
- Build multimodal language agents for very fast prototype and production☆1,198Updated this week
- Improving Generalist Model with Domain-Specific Experts☆79Updated last week
- Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs☆212Updated 3 months ago
- A Doctor for your data☆2,416Updated this week
- Structure your STEM essay in several minutes with Generative AI.☆701Updated 4 months ago
- Official repository of MMGenBench☆88Updated last month
- [NeurIPS 2022] Official Code for REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering☆98Updated 3 months ago
- Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation☆3,419Updated last month
- Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"☆189Updated 5 months ago
- [CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"☆209Updated 3 months ago
- A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, wh…☆1,430Updated last year
- The official implementation of our pre-print paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs".☆253Updated 2 months ago