PKU-Alignment / align-anything
Align Anything: Training All-modality Model with Feedback
☆4,026Updated 3 weeks ago
Alternatives and similar repositories for align-anything
Users who are interested in align-anything are comparing it to the libraries listed below.
- One for All Modalities Evaluation Toolkit - including text, image, video, audio tasks.☆2,659Updated this week
- Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)☆1,859Updated 2 weeks ago
- minimal-cost for training 0.5B R1-Zero☆743Updated last month
- Skywork-R1V2:Multimodal Hybrid Reinforcement Learning for Reasoning☆2,634Updated 2 weeks ago
- Train your Agent model via our easy and efficient framework☆1,175Updated this week
- A fork to add multimodal model training to open-r1☆1,309Updated 4 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL☆2,764Updated this week
- Build multimodal language agents for fast prototype and production☆2,512Updated 3 months ago
- adds Sequence Parallelism into LLaMA-Factory☆518Updated last week
- The codes about "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"☆733Updated last month
- Ola: Pushing the Frontiers of Omni-Modal Language Model☆344Updated 2 weeks ago
- DeepRetrieval - 🔥 Training Search Agent with Retrieval Outcomes via Reinforcement Learning☆569Updated last week
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning☆665Updated 3 weeks ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS☆1,193Updated 2 months ago
- Explore the Multimodal “Aha Moment” on 2B Model☆594Updated 3 months ago
- ✨✨VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model☆588Updated last month
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions☆1,064Updated 8 months ago
- [ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.☆1,480Updated this week
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆776Updated last month
- Next-Token Prediction is All You Need☆2,152Updated 3 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning.☆528Updated 2 months ago
- Real-time and accurate open-vocabulary end-to-end object detection☆1,324Updated 6 months ago
- This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-sta…☆613Updated last week
- "VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos"☆722Updated this week
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey☆663Updated this week
- An MBTI Exploration of Large Language Models☆485Updated last year
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation☆735Updated last month
- A tutorial based on MetaGPT to quickly help you understand the concepts of agent and multi-agent and get started with coding development. 基…☆1,235Updated last year
- [Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges☆932Updated last week
- Recipes to train reward model for RLHF.☆1,386Updated 2 months ago