Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
☆486 · Feb 17, 2026 · Updated last month
Alternatives and similar repositories for VRAG
Users that are interested in VRAG are comparing it to the libraries listed below.
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆645 · Jan 11, 2026 · Updated 2 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆36 · Jul 15, 2025 · Updated 8 months ago
- Parsing-free RAG supported by VLMs ☆939 · Dec 7, 2025 · Updated 3 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆411 · Aug 26, 2025 · Updated 6 months ago
- A Survey on Multimodal Retrieval-Augmented Generation ☆493 · Feb 20, 2026 · Updated last month
- ☆47 · Apr 9, 2025 · Updated 11 months ago
- [ACM MM 2025 🔥🔥] MIRA: A first-of-its-kind medical RAG framework that fuses image features and retrieved knowledge with dynamic contex… ☆20 · Aug 28, 2025 · Updated 6 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆416 · Apr 22, 2025 · Updated 11 months ago
- ☆1,161 · Nov 20, 2025 · Updated 4 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆356 · Jun 1, 2025 · Updated 9 months ago
- ☆36 · Dec 18, 2025 · Updated 3 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL ☆4,748 · Mar 10, 2026 · Updated 2 weeks ago
- Official implementation of MATPO: Multi-Agent Tool-Integrated Policy Optimization. ☆77 · Oct 31, 2025 · Updated 4 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆93 · Aug 8, 2025 · Updated 7 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆4,261 · Nov 13, 2025 · Updated 4 months ago
- Efficient retrieval head analysis with triton flash attention that supports topK probability ☆13 · Jun 15, 2024 · Updated last year
- ☆40 · Aug 4, 2025 · Updated 7 months ago
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models ☆237 · Nov 7, 2025 · Updated 4 months ago
- Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability" ☆149 · May 27, 2025 · Updated 9 months ago
- A holistic framework for advancing LLMs as data science agents ☆39 · Feb 3, 2026 · Updated last month
- The official repository of NodeRAG ☆410 · Mar 19, 2025 · Updated last year
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆700 · Aug 5, 2025 · Updated 7 months ago
- ☆65 · May 19, 2025 · Updated 10 months ago
- EMNLP MAIN 2025 StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization ☆60 · Sep 13, 2025 · Updated 6 months ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation" ☆15 · Aug 26, 2025 · Updated 6 months ago
- ☆66 · Jan 4, 2026 · Updated 2 months ago
- MM-Eureka V0, also called R1-Multimodal-Journey; the latest version is in MM-Eureka ☆325 · Jun 21, 2025 · Updated 9 months ago
- Tongyi Deep Research, the Leading Open-source Deep Research Agent ☆18,521 · Feb 27, 2026 · Updated 3 weeks ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆2,563 · Mar 17, 2026 · Updated last week
- ☆63 · Jan 3, 2025 · Updated last year
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆107 · Sep 18, 2025 · Updated 6 months ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen… ☆644 · Feb 15, 2026 · Updated last month
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced … ☆91 · Nov 15, 2024 · Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆124 · Nov 25, 2024 · Updated last year
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in… ☆1,373 · Mar 9, 2026 · Updated 2 weeks ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- ZeroSearch: Incentivize the Search Capability of LLMs without Searching ☆1,252 · Aug 16, 2025 · Updated 7 months ago
- Code for Retrieval-Augmented Perception (ICML 2025) ☆69 · Mar 16, 2026 · Updated last week
- ☆109 · Aug 14, 2025 · Updated 7 months ago