A curated list of the latest advancements, papers, tools, and datasets for **Multimodal Retrieval-Augmented Generation (RAG)**. Multimodal RAG integrates information retrieval and generation across multiple data modalities (e.g., text, image, video, audio).
☆53 · Nov 25, 2025 · Updated 6 months ago
Alternatives and similar repositories for Awesome-Multimodal-RAG
Users that are interested in Awesome-Multimodal-RAG are comparing it to the libraries listed below.
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents ☆40 · Apr 13, 2026 · Updated last month
- The implementation of SSTAN in SUN-SEG dataset. (Semi-supervised Spatial Temporal Attention Network for Video Polyp Segmentation, MICCAI … ☆13 · Jul 25, 2024 · Updated last year
- In RAG, generating and matching embedding vectors is a key step. This article introduces an embedding service based on CLIP/BLIP models that supports embedding generation and similarity computation for text and images, providing a foundation for multimodal information retrieval. ☆42 · Dec 28, 2024 · Updated last year
- Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications ☆71 · Nov 6, 2024 · Updated last year
- [SIGIR '26] Mixture-of-Retrieval Experts for Reasoning-Guided Multimodal Knowledge Exploitation ☆41 · May 15, 2026 · Updated 2 weeks ago
- Official repo for "TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series" ☆28 · May 14, 2025 · Updated last year
- LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration ☆11 · Mar 11, 2024 · Updated 2 years ago
- ☆11 · Apr 8, 2023 · Updated 3 years ago
- A collection of MoE (Mixture of Experts) papers, code, tools, etc. ☆12 · Mar 15, 2024 · Updated 2 years ago
- This repository records what I learn from the book Python3网络爬虫开发实战(第2版). ☆11 · Mar 29, 2023 · Updated 3 years ago
- Built on index-tts-vllm, implements and provides an interface service for simulated streaming audio synthesis, along with a client test script ☆26 · Sep 2, 2025 · Updated 8 months ago
- An open source implementation of R1 ☆31 · May 18, 2026 · Updated last week
- ☆10 · Oct 29, 2020 · Updated 5 years ago
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022) ☆16 · Feb 11, 2023 · Updated 3 years ago
- Utility functions/scripts for working with GPUs. ☆10 · Jul 5, 2021 · Updated 4 years ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and … ☆11 · Jun 18, 2024 · Updated last year
- Official repository for the paper "Reconstruction of Perceived Images from fMRI Patterns and Semantic Brain Exploration using Instance-Co… ☆24 · May 19, 2022 · Updated 4 years ago
- ACL 2026 & NAACL 2025: Bridging Retrieval and Inference through Evidence Fusion ☆13 · Apr 9, 2026 · Updated last month
- ☆12 · Jan 10, 2025 · Updated last year
- Just a simple Android app that uses Rokid's CXR-M SDK to upload/sideload an APK onto your Rokid glasses over Wi-Fi. It might be hard to g… ☆45 · Apr 9, 2026 · Updated last month
- ☆12 · Jun 21, 2020 · Updated 5 years ago
- [NeurIPS 2025] Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning ☆137 · Dec 13, 2025 · Updated 5 months ago
- Happy Hacking With Claude!!! ☆25 · Oct 27, 2025 · Updated 7 months ago
- Adds a simple interface and API calling method for indextts2 ☆27 · Oct 27, 2025 · Updated 7 months ago
- [ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation ☆15 · Sep 6, 2024 · Updated last year
- Toolkit to help you do better research ☆11 · Apr 19, 2019 · Updated 7 years ago
- Code for "Neural Network-based Reconstruction in Compressed Sensing MRI Without Fully-sampled Training Data" ☆12 · Jan 5, 2021 · Updated 5 years ago
- ☆12 · Apr 11, 2019 · Updated 7 years ago
- Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretica… ☆16 · Sep 4, 2025 · Updated 8 months ago
- Code for NeurIPS 2019 paper "From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI" ☆42 · Apr 24, 2022 · Updated 4 years ago
- AI-driven virtual digital human live-streaming system, supporting modular development of 2D/3D digital humans, TTS, ASR, lip sync, stream pushing, interaction, and more. ☆25 · May 13, 2025 · Updated last year
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆62 · Jan 22, 2025 · Updated last year
- Repo for paper: Adaptive Checkpoint Adjoint (ACA) method for gradient estimation in neural ODE ☆56 · Mar 13, 2021 · Updated 5 years ago
- Controlled Online Optimization Learning (COOL): Finding the Ground State of Spin Hamiltonians with Reinforcement Learning (arXiv:2003.000… ☆13 · Jun 18, 2020 · Updated 5 years ago
- PPT-to-digital-human backend ☆20 · Apr 9, 2025 · Updated last year
- Code for ISBI'19 Tutorial ☆36 · Jan 6, 2026 · Updated 4 months ago
- ☆25 · Sep 1, 2025 · Updated 8 months ago
- [ACL2026 Findings] "Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models" ☆20 · Mar 25, 2025 · Updated last year
- [CVPR 2022] Official PyTorch implementation for Attributable Visual Similarity Learning ☆34 · Oct 17, 2022 · Updated 3 years ago