[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
☆79, Jan 19, 2026 (updated last month)
Alternatives and similar repositories for EchoSight
Users who are interested in EchoSight compare it to the libraries listed below.
- [EMNLP 2024] RaTEScore: A Metric for Radiology Report Generation ☆64, May 18, 2025 (updated 9 months ago)
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers. ☆19, Feb 15, 2024 (updated 2 years ago)
- This is the official repository for Retrieval Augmented Visual Question Answering ☆244, Dec 19, 2024 (updated last year)
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆54, May 25, 2025 (updated 9 months ago)
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering ☆54, Jul 14, 2025 (updated 7 months ago)
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆178, Jul 7, 2025 (updated 7 months ago)
- ☆68, Oct 27, 2023 (updated 2 years ago)
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆55, Nov 26, 2024 (updated last year)
- EMNLP2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions ☆25, May 30, 2024 (updated last year)
- code for A Large-scale Dataset for Audio-Language Representation Learning ☆14, Sep 18, 2024 (updated last year)
- [ICCV 2025] MRGen: Segmentation Data Engine for Underrepresented MRI Modalities ☆38, Sep 26, 2025 (updated 5 months ago)
- Code implementation of RP3D-Diag ☆17, Nov 25, 2024 (updated last year)
- ☆27, Jul 18, 2025 (updated 7 months ago)
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever. ☆104, May 30, 2025 (updated 9 months ago)
- coded with and corrected by Google Anti-Gravity ☆13, Nov 23, 2025 (updated 3 months ago)
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆59, Jan 22, 2025 (updated last year)
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation ☆94, Jan 2, 2025 (updated last year)
- The official codes for "AutoRG-Brain: Grounded Report Generation for Brain MRI". ☆49, Jan 6, 2026 (updated last month)
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities ☆43, Jun 7, 2025 (updated 8 months ago)
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"