Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆280 Updated 2 weeks ago
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the libraries listed below
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆349 Updated 2 months ago
- MMR1: Advancing the Frontiers of Multimodal Reasoning ☆162 Updated 4 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆241 Updated 2 weeks ago
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆541 Updated 3 months ago
- Collect every awesome work about r1! ☆394 Updated 2 months ago
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World ☆269 Updated last month
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation ☆242 Updated last week
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆189 Updated 3 months ago
- The development and future prospects of multimodal reasoning models. ☆431 Updated last week
- ☆173 Updated 5 months ago
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning. ☆787 Updated this week
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data. ☆244 Updated 4 months ago
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆203 Updated last month
- ☆629 Updated last week
- Agentic RAG R1 Framework via Reinforcement Learning ☆252 Updated last month
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines ☆125 Updated 8 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision ☆194 Updated 2 weeks ago
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud ☆118 Updated last month
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆194 Updated last month
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆201 Updated this week
- A Survey on Multimodal Retrieval-Augmented Generation ☆260 Updated this week
- ☆85 Updated last year
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆503 Updated 3 months ago
- This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension" ☆208 Updated this week
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi… ☆255 Updated 2 weeks ago
- ☆266 Updated last month
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆588 Updated last month
- ☆270 Updated last month
- ☆366 Updated 5 months ago
- [CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness ☆385 Updated 2 months ago