rhymes-ai / Aria
Codebase for Aria - an Open Multimodal Native MoE
☆1,030 Updated 2 months ago
Alternatives and similar repositories for Aria:
Users who are interested in Aria are comparing it to the repositories listed below.
- Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation ☆741 Updated 8 months ago
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. ☆879 Updated 2 weeks ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,482 Updated last month
- Next-Token Prediction is All You Need ☆2,064 Updated 3 weeks ago
- OLMoE: Open Mixture-of-Experts Language Models ☆701 Updated 3 weeks ago
- ☆810 Updated 2 weeks ago
- Muon is Scalable for LLM Training ☆1,005 Updated 2 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆1,424 Updated this week
- ☆1,349 Updated 4 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,977 Updated 8 months ago
- Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling" ☆836 Updated 7 months ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,053 Updated this week
- Large Reasoning Models ☆799 Updated 4 months ago
- A fork to add multimodal model training to open-r1 ☆1,170 Updated 2 months ago
- Dream 7B, a large diffusion language model ☆444 Updated this week
- LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ☆1,938 Updated 2 months ago
- Scalable RL solution for advanced reasoning of language models ☆1,467 Updated 3 weeks ago
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,559 Updated this week
- Rethinking Step-by-step Visual Reasoning in LLMs ☆286 Updated 2 months ago
- Explore the Multimodal “Aha Moment” on 2B Model ☆555 Updated 3 weeks ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆1,722 Updated this week
- LLM2CLIP makes SOTA pretrained CLIP model more SOTA ever. ☆505 Updated 2 weeks ago
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,473 Updated last week
- Frontier Multimodal Foundation Models for Image and Video Understanding ☆712 Updated this week
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆571 Updated last week
- Code release for "LLMs can see and hear without any training" ☆230 Updated last month
- Democratizing Reinforcement Learning for LLMs ☆2,399 Updated this week
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,158 Updated 3 weeks ago
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning ☆610 Updated this week
- ☆538 Updated last week