Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆423 Updated 2 months ago
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the libraries listed below.
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆403 Updated 8 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆372 Updated 4 months ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆270 Updated 4 months ago
- Parsing-free RAG supported by VLMs ☆889 Updated 3 weeks ago
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆574 Updated 8 months ago
- The paper list of "Memory in the Age of AI Agents: A Survey" ☆625 Updated this week
- Agentic RAG R1 Framework via Reinforcement Learning ☆362 Updated 2 weeks ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆667 Updated 4 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision ☆291 Updated 2 months ago
- A Survey on Multimodal Retrieval-Augmented Generation ☆451 Updated last month
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆238 Updated last month
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆679 Updated 2 months ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆619 Updated 6 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆298 Updated 2 months ago
- ☆1,177 Updated 2 months ago
- The development and future prospects of large multimodal reasoning models. ☆564 Updated 4 months ago
- ☆472 Updated 2 weeks ago
- ☆472 Updated 2 months ago
- ☆1,039 Updated last month
- Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning ☆1,116 Updated last month
- ☆403 Updated 2 months ago
- Collect every awesome work about r1! ☆426 Updated 7 months ago
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆211 Updated 3 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆230 Updated this week
- ✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi… ☆368 Updated 2 months ago
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train … ☆220 Updated 6 months ago
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow. ☆844 Updated 5 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆298 Updated 2 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆510 Updated 3 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆458 Updated this week