facebookresearch / unibenchLinks

Python Library to evaluate VLM models' robustness across diverse benchmarks

☆210

Alternatives and similar repositories for unibench

Users that are interested in unibench are comparing it to the libraries listed below

Sorting:

zeyofu / BLINK_Benchmark
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…
☆133Updated last year
mu-cai / matryoshka-mm
Matryoshka Multimodal Models
☆112Updated 6 months ago
tulip-berkeley / open_clip
An open source implementation of CLIP (With TULIP Support)
☆162Updated 2 months ago
bronyayang / Law_of_Vision_Representation_in_MLLMs
Official implementation of the Law of Vision Representation in MLLMs
☆163Updated 8 months ago
LeslieTrue / SFTvsRL
Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
☆289Updated 3 months ago
TRI-ML / vlm-evaluation
VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning
☆120Updated 10 months ago
Hon-Wong / VoRA
[Fully open] [Encoder-free MLLM] Vision as LoRA
☆322Updated last month
aimagelab / LLaVA-MORE
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning
☆141Updated 2 weeks ago
facebookresearch / webssl
Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL).
☆171Updated 3 months ago
TIGER-AI-Lab / Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
☆223Updated 4 months ago
Understanding-Visual-Datasets / VisDiff
Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral)
☆120Updated last year
EvolvingLMMs-Lab / multimodal-sae
[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
☆146Updated 3 weeks ago
TIGER-AI-Lab / MEGA-Bench
This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR2025]
☆73Updated last month
kongds / E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
☆262Updated 7 months ago
FreedomIntelligence / LongLLaVA
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
☆207Updated 7 months ago
UCSC-VLAA / Recap-DataComp-1B
[ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"
☆138Updated last year
reka-ai / reka-vibe-eval
Multimodal language model benchmark, featuring challenging examples
☆173Updated 7 months ago
SHI-Labs / CuMo
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
☆152Updated last year
ByungKwanLee / Meteor
[NeurIPS 2024] Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to im…
☆115Updated last year
Yangyi-Chen / SOLO
[TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"
☆144Updated 8 months ago
orrzohar / Video-STaR
[ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
☆66Updated last year
allenai / unified-io-2.pytorch
☆76Updated last year
kaiyuyue / nxtp
[CVPR'24 Highlight] PyTorch Implementation of Object Recognition as Next Token Prediction
☆180Updated 3 months ago
apple / ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
☆246Updated 6 months ago
visual-haystacks / vhs_benchmark
🔥 [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆29Updated 5 months ago
declare-lab / LLM-PuzzleTest
This repository is maintained to release dataset and models for multimodal puzzle reasoning.
☆99Updated 5 months ago
chenllliang / DnD-Transformer
[ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr…
☆76Updated 7 months ago
bfshi / scaling_on_scales
When do we not need larger vision models?
☆404Updated 5 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning"
☆134Updated 2 months ago
NVlabs / LITA
☆179Updated 9 months ago