g-luo / geolocation_via_guidebook_groundingLinks

G^3: Geolocation via Guidebook Grounding, Findings of EMNLP 2022

☆17

Alternatives and similar repositories for geolocation_via_guidebook_grounding

Users that are interested in geolocation_via_guidebook_grounding are comparing it to the libraries listed below

Sorting:

keio-smilab24 / Polos
[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
☆32Updated 6 months ago
redcaps-dataset / redcaps-downloader
Command-line tool for downloading and extending the RedCaps dataset.
☆50Updated last year
ajd12342 / why-winoground-hard
Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022
☆31Updated 2 years ago
edchengg / oven_eval
ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆43Updated 5 months ago
fawazsammani / nlxgpt
NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)
☆49Updated last year
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆88Updated last year
amitakamath / whatsup_vlms
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
☆66Updated last year
lupantech / IconQA
Data and code for NeurIPS 2021 Paper "IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning".
☆53Updated last year
IntelLabs / VL-InterpreT
Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers
☆98Updated 2 years ago
NVlabs / PALAVRA
☆53Updated 3 years ago
google-deepmind / svo_probes
The SVO-Probes Dataset for Verb Understanding
☆31Updated 3 years ago
gregor-ge / FOCI-Benchmark
We present **FOCI**, a benchmark for Fine-grained Object ClassIfication for large vision language models (LVLMs).
☆18Updated last year
yonatanbitton / wysiwyr
☆37Updated 2 years ago
goel-shashank / CyCLIP
☆120Updated 2 years ago
allenai / swig
Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)
☆69Updated 4 years ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆286Updated 2 years ago
vl-illusion / GVIL
Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"
☆14Updated last year
allenai / sherlock
Code, data, models for the Sherlock corpus
☆58Updated 3 years ago
aimagelab / awesome-captioning-evaluation
[IJCAI 2025] Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
☆22Updated last week
limanling / KnowledgeVL-Reading
☆67Updated 2 years ago
NewsStoriesData / newsstories.github.io
☆22Updated 3 years ago
princetonvisualai / pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
☆19Updated 3 years ago
yuhui-zh15 / drml
Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)
☆34Updated 2 years ago
open-vision-language / oven
☆40Updated 2 years ago
adobe-research / vaw_dataset
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in th…
☆68Updated 3 years ago
kevinzakka / clip_playground
An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities
☆175Updated 3 years ago
ys-zong / FoolyourVLLMs
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
☆15Updated 2 years ago
DavidHuji / CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆202Updated last year
DavidMChan / caption-by-committee
Using LLMs and pre-trained caption models for super-human performance on image captioning.
☆42Updated 2 years ago
allenai / gpv-1
A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92Updated 4 years ago