JarvisUSTC / Awesome-Multimodal-RAGLinks
A curated list of the latest advancements, papers, tools, and datasets for **Multimodal Retrieval-Augmented Generation (RAG)**. Multimodal RAG integrates information retrieval and generation across multiple data modalities (e.g., text, image, video, audio).
☆19Updated 4 months ago
Alternatives and similar repositories for Awesome-Multimodal-RAG
Users that are interested in Awesome-Multimodal-RAG are comparing it to the libraries listed below
Sorting:
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C…☆15Updated last year
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆66Updated 6 months ago
- [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality☆31Updated last month
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆46Updated 6 months ago
- ☆31Updated last year
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆43Updated 3 weeks ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆46Updated last week
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics☆18Updated 3 months ago
- Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?☆23Updated 2 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆26Updated last year
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 3 months ago
- ☆16Updated 10 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆25Updated 6 months ago
- Preference Learning for LLaVA☆45Updated 6 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆78Updated 5 months ago
- A trainable user simulator☆34Updated 8 months ago
- ☆54Updated 2 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆38Updated last year
- ☆13Updated last year
- ☆105Updated 2 months ago
- Official Repo for FoodieQA paper (EMNLP 2024)☆16Updated 6 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 7 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP)☆62Updated 3 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆44Updated 10 months ago
- Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning☆25Updated last week
- ☆60Updated 2 weeks ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Updated 2 months ago
- MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion (ACL 2025)☆22Updated last week
- 超简单复现Deepseek-R1-Zero和Deepseek-R1,以「24点游戏」为例。通过zero-RL、SFT以及SFT+RL,以激发LLM的自主验证反思能力。 About Clean, minimal, accessible reproduction of Dee…☆19Updated 2 months ago
- ☆29Updated 8 months ago