ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆43Jun 7, 2025Updated 8 months ago
Alternatives and similar repositories for oven_eval
Users who are interested in oven_eval are comparing it to the libraries listed below
- EMNLP2023 - InfoSeek: A New VQA Benchmark focused on Visual Info-Seeking Questions☆25May 30, 2024Updated last year
- ☆43Aug 15, 2023Updated 2 years ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- ☆68Oct 27, 2023Updated 2 years ago
- Evaluate robustness of adaptation methods on large vision-language models☆19Aug 23, 2023Updated 2 years ago
- Official Repository for Can Language Models be Instructed to Protect Personal Information?☆13Oct 8, 2023Updated 2 years ago
- ☆13Apr 23, 2025Updated 10 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆25Nov 23, 2024Updated last year
- ACL 2023 (Findings) End-to-end Cross-lingual Label Project☆14Nov 24, 2023Updated 2 years ago
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)☆143Jan 5, 2026Updated 2 months ago
- Official repository for the A-OKVQA dataset☆110May 8, 2024Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆45Nov 29, 2023Updated 2 years ago
- Code and data release for the paper "Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Align…☆19Apr 5, 2024Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆79Jan 19, 2026Updated last month
- An automatic MLLM hallucination detection framework☆19Sep 26, 2023Updated 2 years ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…☆39Feb 24, 2025Updated last year
- Entity-Driven Image Search over Multimodal Web Content (EMNLP 2023)☆26Dec 2, 2023Updated 2 years ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- A MaskGIT port from JAX to PyTorch☆18Jun 18, 2022Updated 3 years ago
- ☆22May 4, 2023Updated 2 years ago
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)☆20Dec 4, 2021Updated 4 years ago
- Dataset and starting code for visual entailment dataset☆119Apr 21, 2022Updated 3 years ago
- [ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"☆208Oct 28, 2024Updated last year
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆91Nov 15, 2024Updated last year
- ☆24Jul 8, 2023Updated 2 years ago
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024☆21May 30, 2024Updated last year
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]☆24Aug 13, 2024Updated last year
- [NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training☆27Dec 5, 2023Updated 2 years ago
- Big-Interleaved-Dataset☆58Jan 21, 2023Updated 3 years ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"☆58Sep 3, 2024Updated last year
- ☆27Jul 20, 2024Updated last year
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆26Jan 14, 2025Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆32Mar 26, 2025Updated 11 months ago
- visual question answering prompting recipes for large vision-language models☆28Sep 14, 2024Updated last year
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation"☆60Dec 17, 2023Updated 2 years ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval☆55Nov 26, 2024Updated last year
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.☆104May 30, 2025Updated 9 months ago
- SimMatchV2: Semi-Supervised Learning with Graph Consistency☆22Dec 26, 2023Updated 2 years ago
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"☆29Apr 16, 2024Updated last year