Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆364 Updated 3 months ago
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the libraries listed below.
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆384 Updated 5 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆326 Updated last month
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆230 Updated 2 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆568 Updated 5 months ago
- Deep Research Agent CognitiveKernel-Pro from Tencent AI Lab. Paper: https://arxiv.org/pdf/2508.00414 ☆353 Updated last month
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation ☆381 Updated last week
- Collect every awesome work about r1! ☆418 Updated 5 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆621 Updated 5 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆222 Updated 3 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆640 Updated 2 months ago
- Agentic RAG R1 Framework via Reinforcement Learning ☆302 Updated 2 weeks ago
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆197 Updated 2 weeks ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆577 Updated 3 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆438 Updated last month
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆264 Updated last month
- ☆851 Updated last month
- ☆356 Updated 3 months ago
- MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, Brow… ☆698 Updated this week
- Parsing-free RAG supported by VLMs ☆799 Updated 7 months ago
- The development and future prospects of multimodal reasoning models. ☆508 Updated 2 months ago
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆225 Updated 4 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision ☆237 Updated 3 weeks ago
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World ☆284 Updated 4 months ago
- Awesome Agent Training ☆233 Updated last month
- Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning ☆820 Updated 2 months ago
- ☆985 Updated 2 weeks ago
- ☆408 Updated last month
- A Survey on Multimodal Retrieval-Augmented Generation ☆374 Updated 2 weeks ago
- Explore the Multimodal “Aha Moment” on 2B Model ☆608 Updated 6 months ago
- [CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness ☆413 Updated 4 months ago