JarvisUSTC/Awesome-Multimodal-RAG

A curated list of the latest advancements, papers, tools, and datasets for **Multimodal Retrieval-Augmented Generation (RAG)**. Multimodal RAG integrates information retrieval and generation across multiple data modalities (e.g., text, image, video, audio).

☆53

Alternatives and similar repositories for Awesome-Multimodal-RAG

Users that are interested in Awesome-Multimodal-RAG are comparing it to the libraries listed below.

nttmdlab-nlp / VDocRAG
[CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
☆66 · May 26, 2025 · Updated last year
ZetangForward / CMD-Context-aware-Model-self-Detoxification
CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper)
☆17 · Feb 10, 2025 · Updated last year
MMDocRAG / MMDocRAG
The code used to train and run inference with MMDocRAG
☆21 · Nov 6, 2025 · Updated 8 months ago
SJTU-IPADS / copier
Copy as an OS Service
☆27 · Nov 20, 2025 · Updated 8 months ago
jfma-USTC / HRDoc
Dataset and scripts for HRDoc
☆42 · Jun 21, 2023 · Updated 3 years ago
zhiyuns / UNITPathSSL
Official PyTorch implementation of the TMI paper "Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for…
☆16 · Mar 13, 2024 · Updated 2 years ago
dengc2023 / LongDocURL
☆41 · Apr 6, 2026 · Updated 3 months ago
ivattyue / Ada-K
Official code for the ICLR 2025 paper, "Ada-K Routing: Boosting the Efficiency of MoE-based LLMs"
☆12 · Mar 1, 2025 · Updated last year
hwcloud-RAS / SmartHW
☆13 · May 16, 2025 · Updated last year
syr-cn / ReMemR1
Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents
☆42 · Apr 13, 2026 · Updated 3 months ago
riedlerm / multimodal_rag_for_industry
Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications
☆72 · Nov 6, 2024 · Updated last year
allen-li1231 / treehop-rag
Highly Efficient Query Rewriter for Passage Retrieval in the realm of Retrieval-Augmented Generation (RAG)
☆30 · May 6, 2025 · Updated last year
isdaviddong / Linebot-Demo-FaceRecognition
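This LINE bot example is an image and face recognition bot built with LineBotSDK. Users can send a photo to the bot, and it recognizes the photo's content (caption) as well as the people in the photo, their gender, age, ...
☆22 · Mar 5, 2019 · Updated 7 years ago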
neighthan / gpu-utils
Utility functions/scripts for working with GPUs.
☆10 · Jul 5, 2021 · Updated 5 years ago
vec-ai / wikiHow-TIIR
[ACL 2025] Towards Text-Image Interleaved Retrieval
☆16 · Sep 3, 2025 · Updated 10 months ago
alessant / HEE
☆11 · Apr 8, 2023 · Updated 3 years ago
VITA-Group / MAD
[ICLR 2020] Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma, "I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifie…
☆20 · Dec 30, 2021 · Updated 4 years ago
Minhtrna / Pycamo
Python Camouflage Pattern Generator, GUI available
☆26 · Apr 22, 2026 · Updated 2 months ago
Alibaba-NLP / VRAG
Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
☆969 · Apr 29, 2026 · Updated 2 months ago
EightEggs / Python-Spiders
This repository records what I learned from the book Python3网络爬虫开发实战（第2版）.
☆11 · Mar 29, 2023 · Updated 3 years ago
OpenMatch / MARVEL
[ACL 2024 Oral] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…
☆39 · Jun 30, 2024 · Updated 2 years ago
ayanban011 / GraphKD
[ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
☆16 · Sep 6, 2024 · Updated last year
shihengcan / ICM-matcaffe
Scene Parsing via Integrated Classification Model and Variance-Based Regularization (Matlab&Caffe), In CVPR 2019
☆12 · Jun 11, 2019 · Updated 7 years ago
zhengxuJosh / Awesome-RAG-Vision
Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision
☆339 · Jan 25, 2026 · Updated 5 months ago
LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
☆251 · Dec 19, 2024 · Updated last year
odysie / thermoelectricsdb
☆18 · Oct 25, 2022 · Updated 3 years ago
xiye17 / TextualExplInContext
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)
☆16 · Feb 11, 2023 · Updated 3 years ago
kyungmnlee / RenyiCL
Contrastive self-supervised learning using Rényi divergence
☆14 · Oct 21, 2022 · Updated 3 years ago
vyomakesh09 / longagent
LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration
☆11 · Mar 11, 2024 · Updated 2 years ago
isLinXu / paper-read-notes
paper-read-notes
☆13 · Sep 26, 2024 · Updated last year
SJTU-Storage-Lab / CacheSlide
☆35 · Jan 27, 2026 · Updated 5 months ago
lalithjets / SurgicalGPT
☆28 · Feb 7, 2024 · Updated 2 years ago
PositionalHidden / PositionalHidden
To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …
☆12 · Jun 18, 2024 · Updated 2 years ago
DavidMcDonald1993 / heat
Reference implementation of the HEAT algorithm described in https://link.springer.com/chapter/10.1007/978-3-030-62362-3_4
☆11 · Mar 24, 2023 · Updated 3 years ago
RUCBM / AtomMem
☆27 · Mar 31, 2026 · Updated 3 months ago
mandubian / pytorch-neural-ode
Experiment with Neural ODE on PyTorch
☆14 · Aug 9, 2019 · Updated 6 years ago
Edenzzzz / claude-history-sync
Synchronizing Claude Code conversations across machines
☆16 · Jul 3, 2026 · Updated 2 weeks ago
jiani-huang / RecBenchPlus
Benchmark dataset for the paper "Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with …
☆28 · May 20, 2025 · Updated last year
Mungeryang / colqwen3
The code used to train and run inference with the ColQwen3 model. Welcome to follow and star! ⭐️⭐️⭐️ https://huggingface.co/goodman2001/…
☆15 · Jul 4, 2026 · Updated 2 weeks ago