cambridgeltl / visual-spatial-reasoning
[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.
☆127 Updated 2 years ago
Alternatives and similar repositories for visual-spatial-reasoning
Users who are interested in visual-spatial-reasoning are comparing it to the repositories listed below:
- ☆68 Updated 2 years ago
- Official repository for the A-OKVQA dataset ☆96 Updated last year
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022 ☆30 Updated 2 years ago
- PyTorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆115 Updated 2 years ago
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models", NeurIPS 23 Spotlight ☆37 Updated 2 years ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning