Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
☆465 · Feb 17, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the libraries listed below.
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆632 · Jan 11, 2026 · Updated last month
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆402 · Aug 26, 2025 · Updated 6 months ago
- Parsing-free RAG supported by VLMs ☆917 · Dec 7, 2025 · Updated 2 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆35 · Jul 15, 2025 · Updated 7 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆414 · Apr 22, 2025 · Updated 10 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆354 · Jun 1, 2025 · Updated 9 months ago
- A holistic framework for advancing LLMs as data science agents ☆33 · Feb 3, 2026 · Updated last month
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆92 · Aug 8, 2025 · Updated 6 months ago
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models ☆233 · Nov 7, 2025 · Updated 3 months ago
- A Survey on Multimodal Retrieval-Augmented Generation ☆483 · Feb 20, 2026 · Updated last week
- ☆63 · Jul 11, 2025 · Updated 7 months ago
- [ACM MM 2025 🔥🔥 ] MIRA: A first-of-its-kind medical RAG framework that fuses image features and retrieved knowledge with dynamic contex… ☆18 · Aug 28, 2025 · Updated 6 months ago
- ☆34 · Dec 18, 2025 · Updated 2 months ago
- Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability" ☆150 · May 27, 2025 · Updated 9 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL ☆4,649 · Updated this week
- The official repository of NodeRAG ☆412 · Mar 19, 2025 · Updated 11 months ago
- ☆1,137 · Nov 20, 2025 · Updated 3 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆152 · Jul 22, 2025 · Updated 7 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆105 · Sep 18, 2025 · Updated 5 months ago
- ☆47 · Apr 9, 2025 · Updated 10 months ago
- ☆39 · Aug 4, 2025 · Updated 7 months ago
- EMNLP MAIN 2025 StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization ☆59 · Sep 13, 2025 · Updated 5 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆4,085 · Nov 13, 2025 · Updated 3 months ago
- [EMNLP 2024: Demo Oral] RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation ☆310 · Oct 18, 2024 · Updated last year
- ☆123 · Jan 19, 2026 · Updated last month
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen… ☆608 · Feb 15, 2026 · Updated 2 weeks ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆689 · Aug 5, 2025 · Updated 7 months ago
- ☆524 · Feb 4, 2026 · Updated last month
- ☆64 · Feb 4, 2026 · Updated last month
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆26 · Dec 14, 2025 · Updated 2 months ago
- Official Implementation for *PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling* ☆32 · Dec 13, 2025 · Updated 2 months ago
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025] ☆45 · Jul 22, 2025 · Updated 7 months ago
- A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories. ☆178 · Jul 6, 2025 · Updated 7 months ago
- Tongyi Deep Research, the Leading Open-source Deep Research Agent ☆18,337 · Updated this week
- ☆109 · Aug 14, 2025 · Updated 6 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆2,534 · Feb 24, 2026 · Updated last week
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation ☆1,529 · Updated this week
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment ☆35 · Jul 1, 2024 · Updated last year
- A pre-built agent for TableGPT2. ☆632 · Feb 11, 2026 · Updated 3 weeks ago