JarvisUSTC / Awesome-Multimodal-RAG
A curated list of the latest advancements, papers, tools, and datasets for **Multimodal Retrieval-Augmented Generation (RAG)**. Multimodal RAG integrates information retrieval and generation across multiple data modalities (e.g., text, image, video, audio).
☆16 · Updated 3 months ago
Alternatives and similar repositories for Awesome-Multimodal-RAG
Users who are interested in Awesome-Multimodal-RAG are comparing it to the repositories listed below.
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆18 · Updated 3 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆40 · Updated last week
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 · Updated 6 months ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… ☆15 · Updated 11 months ago
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆27 · Updated last month
- Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆15 · Updated 2 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆63 · Updated 2 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 · Updated 3 months ago
- Official Repo for FoodieQA paper (EMNLP 2024) ☆16 · Updated 5 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning" ☆26 · Updated 11 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 · Updated 6 months ago
- ☆16 · Updated 9 months ago
- ☆24 · Updated last month
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling' ☆19 · Updated this week
- ☆20 · Updated 2 months ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆13 · Updated last month
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆18 · Updated last month
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆38 · Updated 7 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆45 · Updated this week
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model ☆44 · Updated 6 months ago
- Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models? ☆24 · Updated 2 months ago
- ☆22 · Updated 10 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆59 · Updated 6 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆25 · Updated 5 months ago
- ☆12 · Updated last month
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆37 · Updated 5 months ago
- ☆42 · Updated 2 months ago
- ☆27 · Updated last year
- Preference Learning for LLaVA ☆44 · Updated 6 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆58 · Updated last year