open-vision-language/infoseek

☆78

Alternatives and similar repositories for infoseek

Users who are interested in infoseek are comparing it to the repositories listed below.

edchengg / infoseek_eval
EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions
☆26 · May 30, 2024 · Updated 2 years ago
open-vision-language / oven
☆47 · Aug 15, 2023 · Updated 2 years ago
edchengg / oven_eval
ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆44 · Jun 7, 2025 · Updated last year
PaulLerner / ViQuAE
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…
☆39 · Dec 19, 2024 · Updated last year
MrZilinXiao / AutoVER
[ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.
☆14 · Mar 2, 2024 · Updated 2 years ago
LinWeizheDragon / FLMR
The Hugging Face implementation of Fine-grained Late-interaction Multi-modal Retriever.
☆108 · May 30, 2025 · Updated last year
Go2Heart / EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
☆90 · Jan 19, 2026 · Updated 6 months ago
cqu-student / Wiki-PRF
☆19 · Mar 9, 2026 · Updated 4 months ago
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year
LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
☆252 · Dec 19, 2024 · Updated last year
Jam1ezhang / RankCLIP
Ranking-Consistent Language-Image Pretraining
☆15 · Oct 24, 2025 · Updated 9 months ago
allenai / aokvqa
Official repository for the A-OKVQA dataset
☆117 · May 8, 2024 · Updated 2 years ago
uvavision / SyViC
[ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data
☆13 · Sep 30, 2023 · Updated 2 years ago
mi92 / reverse-image-rag
☆15 · Jul 8, 2024 · Updated 2 years ago
EvolvingLMMs-Lab / multimodal-search-r1
[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal…
☆470 · Apr 7, 2026 · Updated 3 months ago
guilk / KAT
Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
☆71 · Jul 11, 2022 · Updated 4 years ago
snap-research / MyVLM
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
☆188 · Jul 5, 2024 · Updated 2 years ago
WebQnA / WebQA
☆68 · Jan 3, 2025 · Updated last year
vl-illusion / GVIL
Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"
☆15 · Jan 25, 2024 · Updated 2 years ago
si0wang / ViCrit
☆24 · Jun 18, 2025 · Updated last year
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆94 · Feb 13, 2024 · Updated 2 years ago
RUCAIBox / Event-Bench
Official code of *Towards Event-oriented Long Video Understanding*
☆12 · Jul 26, 2024 · Updated 2 years ago
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22 · Jan 11, 2026 · Updated 6 months ago
TIGER-AI-Lab / ABC
ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR 2025]
☆20 · Aug 21, 2025 · Updated 11 months ago
yic20 / CoMC
[ICML 2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
☆17 · Jul 9, 2024 · Updated 2 years ago
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆183 · Oct 1, 2024 · Updated last year
AdelWang / MIGRES
☆19 · Jun 14, 2024 · Updated 2 years ago
dengc2023 / LongDocURL
☆42 · Apr 6, 2026 · Updated 3 months ago
THU-KEG / Event-Level-Knowledge-Editing
☆12 · Apr 25, 2024 · Updated 2 years ago
phiyodr / vqaloader
PyTorch DataLoader for many VQA datasets
☆15 · Jan 10, 2023 · Updated 3 years ago
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, a data curation pipeline for both physical and semantic counterfactual image-caption pairs.
☆19 · Jun 27, 2024 · Updated 2 years ago
kaist-ami / BEAF
[ECCV'24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models"
☆22 · Mar 26, 2025 · Updated last year
OpenBMB / MoRE
[SIGIR '26] Mixture-of-Retrieval Experts for Reasoning-Guided Multimodal Knowledge Exploitation
☆43 · Jul 1, 2026 · Updated 3 weeks ago
Kordi-Lab / Multi-User-LLM-Agent
Official code for the paper: "Multi-User Large Language Model Agents"
☆27 · May 11, 2026 · Updated 2 months ago
aimagelab / COGT
[ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding
☆10 · Apr 15, 2025 · Updated last year
mjeensung / xtr-pytorch
☆19 · May 16, 2024 · Updated 2 years ago
StanfordMIMI / villa
[ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data
☆45 · Oct 15, 2023 · Updated 2 years ago
RUCAIBox / POPE
The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
☆266 · Aug 21, 2025 · Updated 11 months ago
Yanqing0327 / MLLMs-Augmented
The official implementation of "MLLMs-Augmented Visual-Language Representation Learning"
☆31 · Mar 12, 2024 · Updated 2 years ago