Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision
★333 · Jan 25, 2026 · Updated 3 months ago
Alternatives and similar repositories for Awesome-RAG-Vision
Users who are interested in Awesome-RAG-Vision are comparing it to the repositories listed below.
- The code for the paper "Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning" (CVPR'25). ★15 · Sep 25, 2025 · Updated 7 months ago
- Awesome list of Retrieval-Augmented Generation (RAG) applications in Generative AI. ★1,193 · May 11, 2026 · Updated last week
- Enhancing Ultrahigh Resolution Remote Sensing Imagery Analysis With ImageRAG [GRSM] ★32 · Updated this week
- A Survey on Multimodal Retrieval-Augmented Generation ★512 · Feb 20, 2026 · Updated 3 months ago
- The code for the paper "Efficient Self-Supervised Video Hashing with Selective State Spaces" (AAAI'25). ★23 · Aug 2, 2025 · Updated 9 months ago
- Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text … ★16 · Nov 20, 2025 · Updated 6 months ago
- ★11 · Jan 19, 2025 · Updated last year
- The official repo for "Unified Domain Adaptive Semantic Segmentation" (IEEE TPAMI 2025) ★34 · Aug 14, 2025 · Updated 9 months ago
- ★41 · Mar 28, 2024 · Updated 2 years ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets ★66 · Aug 6, 2025 · Updated 9 months ago
- Hierarchical Navigable Small Worlds ★101 · Aug 8, 2025 · Updated 9 months ago
- PyTorch Implementation of LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification ★103 · Nov 20, 2025 · Updated 6 months ago
- Fetch arXiv data to LLM-friendly text ★132 · Feb 18, 2026 · Updated 3 months ago
- ★512 · Oct 11, 2025 · Updated 7 months ago
- "Geo-metric: A Perceptual Dataset of Distortions on Faces" by Wolski et al., SIGGRAPH Asia 2022. ★24 · Nov 9, 2022 · Updated 3 years ago
- Large Language Model in Action ★343 · Jan 28, 2025 · Updated last year
- Parsing-free RAG supported by VLMs ★956 · Dec 7, 2025 · Updated 5 months ago
- Reading list for multimodal sequence learning ★14 · Sep 4, 2023 · Updated 2 years ago
- ★58 · Jan 19, 2025 · Updated last year
- Im2Haircut: Single-view Strand-based Hair Reconstruction for Human Avatars [ICCV 2025] ★52 · Feb 2, 2026 · Updated 3 months ago
- This repo contains the code and data of "Graph Matching with Bi-level Noisy Correspondence". ★20 · Jul 28, 2023 · Updated 2 years ago
- ★10 · Nov 29, 2022 · Updated 3 years ago
- ★31 · Jul 21, 2025 · Updated 9 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★15 · Jan 22, 2025 · Updated last year
- [IEEE TGRS 2025] Be the Change You Want to See: Revisiting Remote Sensing Change Detection Practices ★36 · Dec 1, 2025 · Updated 5 months ago
- A curated list of awesome Multimodal studies. ★331 · May 13, 2026 · Updated last week
- Semantic Search on Wikipedia with Upstash Vector ★469 · Dec 12, 2025 · Updated 5 months ago
- [AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval ★20 · May 10, 2024 · Updated 2 years ago
- ★15 · Aug 20, 2024 · Updated last year
- [IJCAI'23] The official GitHub page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey". ★33 · Dec 21, 2023 · Updated 2 years ago
- The code for the paper "Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval" (WWW'22, Oral). ★17 · Mar 8, 2022 · Updated 4 years ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection ★26 · May 31, 2025 · Updated 11 months ago
- Inverse Rendering Toolkit ★14 · Feb 24, 2025 · Updated last year
- [CVPR 2023] Learning a 3D Morphable Face Reflectance Model from Low-cost Data ★59 · Aug 7, 2024 · Updated last year
- [ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model ★141 · Apr 9, 2024 · Updated 2 years ago
- Collection of papers and repos for multimodal chain-of-thought ★89 · Nov 6, 2024 · Updated last year
- Multi-sources, Multi-resolution, and Multi-scene dataset for Optical-SAR image matching ★47 · Oct 14, 2025 · Updated 7 months ago
- Elaina is a wavefront implementation of walk on stars. (Code for SIGGRAPH 2025 paper "Guiding-Based Importance Sampling for Walk on Stars… ★28 · Oct 7, 2025 · Updated 7 months ago
- Multimodal RAG using LlamaIndex, Qdrant, and llama.cpp for document QA with a local vision LLM and embedding models ★18 · Nov 8, 2024 · Updated last year