[NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness
☆31 · Sep 27, 2025 · Updated 5 months ago
Alternatives and similar repositories for ColorBench
Users who are interested in ColorBench are comparing it to the libraries listed below
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts ☆17 · Mar 11, 2025 · Updated 11 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆19 · Mar 10, 2025 · Updated 11 months ago
- A unified robotic manipulation learning framework ☆21 · Sep 4, 2025 · Updated 5 months ago
- ☆19 · Nov 7, 2024 · Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆20 · Apr 9, 2025 · Updated 10 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆24 · Sep 26, 2024 · Updated last year
- Implementation of layer diffuse inference using refiners ☆25 · Apr 25, 2024 · Updated last year
- [ECCV'24] MaxFusion: Plug & Play multimodal generation in text to image diffusion models ☆27 · Nov 2, 2024 · Updated last year
- [IJCV 2026] HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts ☆26 · Feb 28, 2025 · Updated last year
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models" ☆45 · Apr 3, 2025 · Updated 10 months ago
- ☆73 · Jul 14, 2024 · Updated last year
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆72 · Jul 10, 2024 · Updated last year
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks ☆14 · Feb 25, 2025 · Updated last year
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding" ☆32 · May 20, 2024 · Updated last year
- MobileLLM-R1 ☆75 · Sep 30, 2025 · Updated 5 months ago
- ☆41 · Jun 9, 2025 · Updated 8 months ago
- [CVPR 2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents ☆59 · May 26, 2025 · Updated 9 months ago
- ☆37 · Oct 17, 2025 · Updated 4 months ago
- [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges ☆83 · Feb 27, 2025 · Updated last year
- [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆90 · Oct 15, 2024 · Updated last year
- ☆10 · May 26, 2025 · Updated 9 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents" ☆29 · Oct 23, 2025 · Updated 4 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 4 months ago
- ☆11 · Jun 22, 2025 · Updated 8 months ago
- Symphony: A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆30 · Oct 30, 2025 · Updated 4 months ago
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an… ☆15 · Dec 3, 2025 · Updated 2 months ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs ☆30 · Updated this week
- Truncate datetime objects to the specified level of precision, inspired by PostgreSQL's DATE_TRUNC. ☆14 · Apr 20, 2021 · Updated 4 years ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 · Sep 29, 2025 · Updated 5 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆42 · Aug 4, 2024 · Updated last year
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 · Aug 28, 2025 · Updated 6 months ago
- Code for Difflare: Removing Image Flare with Latent Diffusion Models ☆11 · Dec 24, 2024 · Updated last year
- ☆25 · Aug 19, 2025 · Updated 6 months ago