PKU-Alignment / align-anything
Align Anything: Training All-modality Model with Feedback
☆2,486 Updated this week
Alternatives and similar repositories for align-anything:
Users who are interested in align-anything are comparing it to the libraries listed below.
- Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval. ☆2,153 Updated this week
- The code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts" ☆694 Updated last month
- 【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models ☆1,710 Updated last week
- Build multimodal language agents for fast prototyping and production ☆2,101 Updated last week
- Adds Sequence Parallelism to LLaMA-Factory ☆261 Updated this week
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions ☆1,041 Updated 4 months ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆779 Updated 2 weeks ago
- An MBTI Exploration of Large Language Models ☆459 Updated last year
- [CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception ☆549 Updated 9 months ago
- A tutorial based on MetaGPT that quickly helps you understand the concepts of agents and multi-agent systems and get started with coding. 基… ☆1,144 Updated 9 months ago
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models ☆888 Updated last month
- OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24] ☆1,238 Updated 2 months ago
- Real-time and accurate open-vocabulary end-to-end object detection ☆1,294 Updated 2 months ago
- Recipes for training reward models for RLHF. ☆1,205 Updated 3 weeks ago
- Unified KV Cache Compression Methods for Auto-Regressive Models ☆911 Updated 2 months ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models ☆244 Updated 3 months ago
- Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conv… ☆413 Updated 2 months ago
- [ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation. ☆1,230 Updated this week
- Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition ☆279 Updated last month
- Large-scale, Informative, and Diverse Multi-round Chat Data (and Models) ☆2,414 Updated 11 months ago
- [ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A Chinese-English bilingual multimodal large model series based on the CPM foundation model ☆1,053 Updated 8 months ago