Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆318 Updated 2 months ago
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the repositories listed below.
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆369 Updated 4 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆296 Updated this week
- Deep Research Agent CognitiveKernel-Pro from Tencent AI Lab. Paper: https://arxiv.org/pdf/2508.00414 ☆313 Updated this week
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation ☆324 Updated this week
- Agentic RAG R1 Framework via Reinforcement Learning ☆285 Updated 3 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆561 Updated 4 months ago
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆219 Updated 3 months ago
- Collect every awesome work about r1! ☆413 Updated 3 months ago
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World ☆280 Updated 3 months ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆201 Updated 3 weeks ago
- ☆318 Updated 2 months ago
- The development and future prospects of multimodal reasoning models. ☆481 Updated 3 weeks ago
- MiroFlow is an agent framework that simplifies the development of complex, multi-agent systems. Build, manage, and scale your AI agents w… ☆384 Updated this week
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆214 Updated 2 months ago
- MMR1: Advancing the Frontiers of Multimodal Reasoning ☆163 Updated 5 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆243 Updated 2 weeks ago
- ☆755 Updated last month
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆531 Updated 2 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision ☆216 Updated this week
- A Survey on Multimodal Retrieval-Augmented Generation ☆319 Updated last week
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud ☆242 Updated this week
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆625 Updated 3 weeks ago
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train … ☆214 Updated 2 months ago
- ☆365 Updated 6 months ago
- An intelligent dataset construction and evaluation platform for training multimodal large models ☆97 Updated last week
- Awesome Agent Training ☆215 Updated 3 weeks ago
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi… ☆277 Updated last month
- Train a LLaVA model with better Chinese support, with open-source training code and data. ☆68 Updated 11 months ago
- Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning" ☆126 Updated 3 months ago
- ☆173 Updated 6 months ago