zhengxuJosh / Awesome-RAG-Vision

Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision

☆264

Alternatives and similar repositories for Awesome-RAG-Vision

Users who are interested in Awesome-RAG-Vision are comparing it to the repositories listed below.

HITsz-TMG / Awesome-Large-Multimodal-Reasoning-Models
The development and future prospects of large multimodal reasoning models.
☆545 Updated 3 months ago
HJYao00 / Awesome-Reasoning-MLLM
Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and Dee…
☆57 Updated 8 months ago
Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce…
☆397 Updated last month
Fancy-MLLM / R1-Onevision
R1-onevision, a visual language model capable of deep CoT reasoning.
☆570 Updated 7 months ago
thunlp / Migician
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
☆81 Updated 6 months ago
jinbo0906 / Awesome-MLLM-Datasets
This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …
☆59 Updated 6 months ago
HC-Guo / Awesome-Multimodal-Chain-of-Thought
Collection of papers and repos for multimodal chain-of-thought
☆89 Updated last year
JoeLeelyf / customize-arxiv-daily
Customize your arXiv recommendation every day.
☆134 Updated last month
Leon1207 / Video-RAG-master
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi…
☆336 Updated 3 weeks ago
yeliudev / VideoMind
💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
☆277 Updated last month
modelscope / awesome-deep-reasoning
Collect every awesome work about r1!
☆421 Updated 6 months ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆348 Updated 2 months ago
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆393 Updated 6 months ago
cnzzx / VSA
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
☆126 Updated last year
ictnlp / FlexRAG
FlexRAG: A RAG Framework for Information Retrieval and Generation.
☆226 Updated 5 months ago
aiming-lab / MDocAgent
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
☆248 Updated 3 months ago
gszfwsb / NCFM
Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function: A Minmax Perspective" (NCFM) in C…
☆389 Updated last month
JackYFL / awesome-VLLMs
This repository collects papers on VLLM applications. We will update new papers irregularly.
☆177 Updated 2 months ago
apple / ml-slowfast-llava
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
☆283 Updated last year
zjysteven / lmms-finetune
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision,…
☆353 Updated 3 weeks ago
yaotingwangofficial / Awesome-MCoT
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
☆888 Updated this week
CaraJ7 / MMSearch
[ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs
☆479 Updated 9 months ago
MM-LLMs / mm-llms.github.io
☆33 Updated 10 months ago
llm-lab-org / Multimodal-RAG-Survey
A Survey on Multimodal Retrieval-Augmented Generation
☆421 Updated last week
zhaochen0110 / OpenThinkIMG
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
☆327 Updated 5 months ago
bytedance / Valley
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
☆256 Updated 2 weeks ago
Visual-Agent / DeepEyes
☆971 Updated 3 weeks ago
HCPLab-SYSU / Book-of-MLM
Book: "Multimodal Large Models: A New-Generation Paradigm of Artificial Intelligence Technology", by Liu Yang and Lin Liang
☆253 Updated 11 months ago
WePOINTS / WePOINTS
☆186 Updated 9 months ago
LightChen233 / Awesome-Long-Chain-of-Thought-Reasoning
Latest Advances on Long Chain-of-Thought Reasoning
☆554 Updated 4 months ago