reka-ai/reka-vibe-eval

Multimodal language model benchmark, featuring challenging examples

☆189

Alternatives and similar repositories for reka-vibe-eval

Users interested in reka-vibe-eval are comparing it to the repositories listed below.

Annusha / xmic
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024
☆11 · Nov 7, 2024 · Updated last year
lscpku / VITATECS
☆18 · Jul 10, 2024 · Updated 2 years ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
locuslab / T-MARS
Code for T-MARS data filtering
☆35 · Aug 23, 2023 · Updated 2 years ago
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 3 years ago
goel-shashank / CyCLIP
☆126 · Feb 21, 2023 · Updated 3 years ago
huggingface / m4-logs
M4 experiment logbook
☆59 · Aug 21, 2023 · Updated 2 years ago
mlfoundations / dataset2metadata
☆28 · Mar 21, 2024 · Updated 2 years ago
amitakamath / whatsup_vlms
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
☆71 · Feb 28, 2024 · Updated 2 years ago
yuweihao / MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
☆329 · Jan 20, 2025 · Updated last year
filipgdorm / eco-llm
☆14 · Mar 20, 2026 · Updated 4 months ago
FreedomIntelligence / ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆281 · Jun 25, 2024 · Updated 2 years ago
mlfoundations / patching
Patching open-vocabulary models by interpolating weights
☆91 · Sep 28, 2023 · Updated 2 years ago
mlfoundations / clip_quality_not_quantity
☆28 · Oct 18, 2022 · Updated 3 years ago
yale-nlp / TOMATO
☆41 · Nov 8, 2024 · Updated last year
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
jzhang38 / EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
☆759 · Sep 27, 2024 · Updated last year
MAmmoTH-VL / MAmmoTH-VL
(ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
☆50 · Jun 4, 2025 · Updated last year
allenai / faithful-nmn
Evaluating and improving the faithfulness of the interpretations offered by Neural Module Networks
☆13 · Jun 12, 2023 · Updated 3 years ago
bfshi / scaling_on_scales
When do we not need larger vision models?
☆420 · Feb 8, 2025 · Updated last year
PKU-YuanGroup / Video-Bench
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
☆140 · Dec 31, 2023 · Updated 2 years ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆297 · Mar 13, 2024 · Updated 2 years ago
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆208 · May 20, 2024 · Updated 2 years ago
mlfoundations / datacomp
DataComp: In search of the next generation of multimodal datasets
☆787 · Apr 28, 2025 · Updated last year
allenai / gpv2-web10k
Download Web-10K data by querying Bing Image Search
☆10 · Feb 1, 2022 · Updated 4 years ago
shoaibahmed / metadata_archaeology
Official code for the paper: "Metadata Archaeology"
☆19 · May 10, 2023 · Updated 3 years ago
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 · Jan 18, 2024 · Updated 2 years ago
vivoutlaw / tcbp
Temporal Compact Bilinear Pooling (TCBP)
☆11 · May 27, 2020 · Updated 6 years ago
facebookresearch / MetaCLIP
NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024
☆1,847 · Nov 27, 2025 · Updated 7 months ago
elisakreiss / concadia
☆16 · Jan 3, 2023 · Updated 3 years ago
allenai / unified-io-2
☆650 · Feb 15, 2024 · Updated 2 years ago
ninatu / howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
☆58 · Aug 19, 2025 · Updated 11 months ago
oxai / visogender
☆13 · May 10, 2025 · Updated last year
facebookresearch / unibench
Python Library to evaluate VLM models' robustness across diverse benchmarks
☆227 · Jun 30, 2026 · Updated 3 weeks ago
danielchyeh / this-is-my
Official This-Is-My Dataset published in CVPR 2023
☆16 · Jul 18, 2024 · Updated 2 years ago
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆93 · Feb 13, 2024 · Updated 2 years ago
ajd12342 / why-winoground-hard
Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022
☆31 · May 29, 2023 · Updated 3 years ago
google-deepmind / perception_test
☆253 · Jun 19, 2026 · Updated last month
arubique / OCCAM
This is an implementation of the paper "Are We Done with Object-Centric Learning?"
☆13 · Jun 21, 2026 · Updated last month