Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆397 Updated last month
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the repositories listed below.
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆393 Updated 6 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆348 Updated 2 months ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆248 Updated 3 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆570 Updated 7 months ago
- Agentic RAG R1 Framework via Reinforcement Learning ☆321 Updated last week
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆486 Updated 2 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆657 Updated last month
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆289 Updated 3 weeks ago
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆234 Updated last week
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆208 Updated last month
- Parsing-free RAG supported by VLMs ☆853 Updated 3 weeks ago
- Collect every awesome work about r1! ☆421 Updated 6 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆653 Updated 3 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆226 Updated 5 months ago
- ☆382 Updated last month
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World ☆297 Updated 5 months ago
- A Survey on Multimodal Retrieval-Augmented Generation ☆421 Updated last week
- ☆971 Updated 3 weeks ago
- MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, Brow… ☆852 Updated last week
- ☆423 Updated last month
- The development and future prospects of large multimodal reasoning models. ☆545 Updated 3 months ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆594 Updated 5 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆241 Updated 3 months ago
- ☆1,093 Updated 3 weeks ago
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow. ☆785 Updated 3 months ago
- ✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi… ☆336 Updated 3 weeks ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision ☆264 Updated last month
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆160 Updated last month
- Explore the Multimodal “Aha Moment” on 2B Model ☆615 Updated 8 months ago
- Official code for Dynamic Parametric RAG. ☆162 Updated 3 months ago