BAAI-DCAI / MMVUView external linksLinks
☆53Mar 19, 2025Updated 10 months ago
Alternatives and similar repositories for MMVU
Users that are interested in MMVU are comparing it to the libraries listed below
Sorting:
- ☆16Apr 28, 2024Updated last year
- ☆20Jan 1, 2026Updated last month
- ☆48Sep 5, 2024Updated last year
- A collection of visual instruction tuning datasets.☆76Mar 14, 2024Updated last year
- A family of lightweight multimodal models.☆1,051Nov 18, 2024Updated last year
- Official code of *Towards Event-oriented Long Video Understanding*☆12Jul 26, 2024Updated last year
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"☆15Jan 25, 2024Updated 2 years ago
- [AAAI 2025] SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization☆29Jan 3, 2025Updated last year
- A Comprehensive Benchmark for Robust Multi-image Understanding☆18Sep 4, 2024Updated last year
- ☆16Oct 21, 2024Updated last year
- 🔥🔥MLVU: Multi-task Long Video Understanding Benchmark☆241Aug 21, 2025Updated 5 months ago
- [WIP@Oct 13] 质衡-基准测试 (Q-Bench in Chinese),包含中文版【底层视觉问答】和【底层视觉描述】数据集,以及中文提示下的图片质量评价。 We will release Q-Bench in more languages in the futu…☆24Jan 7, 2024Updated 2 years ago
- SVIT: Scaling up Visual Instruction Tuning☆166Jun 20, 2024Updated last year
- An benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Nov 17, 2023Updated 2 years ago
- concreteness ratings list☆25Feb 22, 2017Updated 8 years ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]☆24Aug 13, 2024Updated last year
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆106Mar 14, 2024Updated last year
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P…☆25Nov 28, 2023Updated 2 years ago
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)☆28Mar 26, 2024Updated last year
- ☆25May 13, 2024Updated last year
- VisualGPTScore for visio-linguistic reasoning☆27Oct 7, 2023Updated 2 years ago
- ☆32Feb 8, 2024Updated 2 years ago
- [ECCV2024] Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajector…☆30Nov 15, 2025Updated 3 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆64Sep 27, 2025Updated 4 months ago
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models.☆335Sep 14, 2025Updated 5 months ago
- BeHonest: Benchmarking Honesty in Large Language Models☆34Aug 15, 2024Updated last year
- [ICLR 2026] Follow-Your-Shape: This repo is the official implementation of "Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-…☆59Jan 30, 2026Updated 2 weeks ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"☆35Feb 2, 2024Updated 2 years ago
- ☆11Mar 11, 2024Updated last year
- This repo contains the code to reproduce figures in my dissertation "Passive Imaging and Characterization of the Subsurface With Distribu…☆10Jun 14, 2018Updated 7 years ago
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models☆86Aug 19, 2024Updated last year
- Dataset pruning for ImageNet and LAION-2B.☆79Jul 5, 2024Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated last year
- [SCIS] MULTI-Benchmark: Multimodal Understanding Leaderboard with Text and Images☆44Nov 19, 2025Updated 2 months ago
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".☆165Dec 27, 2023Updated 2 years ago
- ☆12Jan 11, 2026Updated last month
- Tensorflow implementation of the paper "Fast Compressive Sensing Using Generative Model with Structed Latent Variables"☆10Apr 7, 2020Updated 5 years ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- [ACM MM 2024 (Oral)] Official PyTorch Implementation of Paper "MovingColor: Seamless Fusion of Fine-grained Video Color Enhancement"☆11Dec 30, 2024Updated last year