☆27 Jul 20, 2024 Updated last year
Alternatives and similar repositories for ConTextual
Users who are interested in ConTextual are comparing it to the libraries listed below.
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering" ☆19 Oct 4, 2022 Updated 3 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 Oct 11, 2024 Updated last year
- ☆13 Jul 2, 2025 Updated 7 months ago
- A spoken version of the textual story cloze benchmark ☆20 Aug 6, 2023 Updated 2 years ago
- LLM evaluation. ☆16 Nov 7, 2023 Updated 2 years ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation… ☆39 Feb 24, 2025 Updated last year
- (CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆360 Jan 14, 2025 Updated last year
- ☆50 Oct 29, 2023 Updated 2 years ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆64 Oct 19, 2024 Updated last year
- Official implementation of TransT-M (the winner of VOT-RT 2021), including code and models. ☆26 Mar 28, 2023 Updated 2 years ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focused on Visual Info-Seeking Questions ☆25 May 30, 2024 Updated last year
- ☆27 Mar 21, 2024 Updated last year
- ☆29 Oct 18, 2022 Updated 3 years ago
- ☆27 Aug 28, 2023 Updated 2 years ago
- ☆32 Feb 8, 2024 Updated 2 years ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆72 Feb 25, 2025 Updated last year
- ☆30 May 9, 2024 Updated last year
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022 ☆31 May 29, 2023 Updated 2 years ago
- ☆29 Jan 23, 2024 Updated 2 years ago
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024) ☆34 Oct 16, 2024 Updated last year
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 23 Spotlight) ☆37 May 23, 2023 Updated 2 years ago
- Touchstone: Evaluating Vision-Language Models by Language Models ☆83 Jan 18, 2024 Updated 2 years ago
- On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench) ☆796 Jul 5, 2025 Updated 7 months ago
- CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics ☆27 Nov 1, 2025 Updated 4 months ago
- ☆85 Aug 18, 2024 Updated last year
- [CVPR 2020] A generative model with latent factors that are independent and localized. ☆12 Mar 27, 2025 Updated 11 months ago
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models ☆86 Aug 19, 2024 Updated last year
- ☆115 May 7, 2025 Updated 9 months ago
- Create generated datasets and train robust classifiers ☆36 Sep 1, 2023 Updated 2 years ago
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts ☆355 Sep 29, 2025 Updated 5 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆86 Oct 26, 2025 Updated 4 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 Dec 12, 2024 Updated last year
- ☆11 May 24, 2024 Updated last year
- 🧠🖼️🐍 A Python wrapper around the BrainFrame REST API ☆12 Jan 7, 2025 Updated last year
- A Swedish Natural Language Understanding Benchmark ☆11 Dec 12, 2025 Updated 2 months ago
- [CVPR2024] Learning from Synthetic Human Group Activities ☆14 Feb 24, 2025 Updated last year
- ☆12 Jan 11, 2026 Updated last month
- A mapping tool for vehicle routing problem datasets and solutions ☆11 Feb 12, 2020 Updated 6 years ago
- A framework for few-shot evaluation of autoregressive language models. ☆12 Jul 14, 2025 Updated 7 months ago