Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
☆156Updated this week
Alternatives and similar repositories for VRAG
Users who are interested in VRAG are comparing it to the repositories listed below
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆169Updated this week
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆329Updated last month
- Search, organize, discover anything!☆48Updated last year
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation☆191Updated this week
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud☆106Updated 2 weeks ago
- MMR1: Advancing the Frontiers of Multimodal Reasoning☆159Updated 2 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆236Updated 3 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision☆168Updated this week
- ☆173Updated 4 months ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding☆161Updated 2 months ago
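- The simplest reproduction of R1 results on small models, explaining the most important essence of O1-like models and DeepSeek R1. Think is all your need. Backed by experiments: for strong reasoning ability, the content of the thinking process is the core of AGI/ASI.☆45Updated 3 months ago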
- Fine-Tuning Dataset Auto-Generation for Graph Query Languages.☆53Updated last week
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆125Updated 7 months ago
- Agentic RAG R1 Framework via Reinforcement Learning☆191Updated 2 weeks ago
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆63Updated 2 weeks ago
- ☆269Updated last week
- ☆210Updated last week
- MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval☆183Updated 2 weeks ago
- ☆151Updated last month
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆103Updated 2 weeks ago
- ☆223Updated last week
- Collect every awesome work about r1!☆376Updated last month
- ☆222Updated last year
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"☆142Updated 2 months ago
- ☆81Updated this week
- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step☆128Updated 3 months ago
- An open platform for enhancing the capability of LLMs in workflow orchestration.☆146Updated 2 months ago
- The official code for NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆126Updated 6 months ago
- ☆85Updated last year
- GLM Series Edge Models☆142Updated 3 months ago