HITsz-TMG / UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
☆754 Updated last week
Related projects:
- [CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models ☆1,766 Updated 2 weeks ago
- An MBTI Exploration of Large Language Models ☆446 Updated 7 months ago
- Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation ☆1,086 Updated 10 months ago
- [ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists" ☆517 Updated 4 months ago
- A tutorial based on MetaGPT to help you quickly understand the concepts of agents and multi-agent systems and get started with coding. … ☆1,329 Updated 4 months ago
- SDG is a specialized framework designed to generate high-quality structured tabular data. ☆3,258 Updated this week
- An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions ☆1,220 Updated last month
- ☆1,960 Updated 2 months ago
- Your Automatic Prompt Engineering Assistant for GenAI Applications ☆2,623 Updated 4 months ago
- OMG-LLaVA and OMG-Seg codebase ☆1,222 Updated last month
- A Doctor for your data ☆3,069 Updated last month
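- CSGHub Server is the backend server for CSGHub, which helps users manage datasets, model files, code, and more. CSGHub Server is an open-source large-model asset management… ☆408 Updated this week
- Real-time and accurate open-vocabulary end-to-end object detection ☆1,482 Updated last week
- airda (Air Data Agent) is a multi-agent system for data analysis that can understand data development and data analysis requirements, understand data, and generate SQL and Python code for tasks such as data querying, data visualization, and machine learning ☆2,133 Updated 2 months ago
- CSGHub is an open-source large model assets platform, similar to an on-premises Hugging Face, which helps to manage datasets, model files, codes a… ☆2,784 Updated this week
- A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, wh… ☆1,899 Updated last year
- cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform that supports SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, vGPU inference serving, edge computing, serverless, a labeling platform, automated labeling… ☆1,664 Updated this week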
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions ☆260 Updated 5 months ago
- Awesome LLMs on Device: A Comprehensive Survey ☆613 Updated this week
- SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling ☆654 Updated last week
- Structure your STEM essay in several minutes with Generative AI. ☆1,038 Updated 3 weeks ago
- [CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions" ☆260 Updated last month
- Matryoshka Query Transformer for Large Vision-Language Models ☆88 Updated 2 months ago
- Create textures for 3D models using Stable Diffusion and Blender ☆833 Updated last year
- [NeurIPS 2022] Official Code for REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering ☆132 Updated last year
- Accelerating the development of large multimodal models (LMMs) with lmms-eval ☆1,334 Updated this week
- Multilingual Corpus of Web Fiction ☆211 Updated 2 months ago
- ☆212 Updated 8 months ago
- PyTorch Implementation of AudioLCM (ACM-MM'24): an efficient and high-quality text-to-audio generation with latent consistency model. ☆1,114 Updated 2 months ago
- Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators" ☆192 Updated last month