☆27Jan 23, 2024Updated 2 years ago
Alternatives and similar repositories for MMCBench
Users that are interested in MMCBench are comparing it to the libraries listed below
Sorting:
- [ICLR2025] Detecting Backdoor Samples in Contrastive Language Image Pretraining☆19Feb 26, 2025Updated last year
- The code for ACM MM2024 (Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning)☆15Jul 18, 2024Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- [WIP@Oct 13] 质衡-基准测试 (Q-Bench in Chinese),包含中文版【底层视觉问答】和【底层视觉描述】数据集,以及中文提示下的图片质量评价。 We will release Q-Bench in more languages in the futu…☆24Jan 7, 2024Updated 2 years ago
- ☆20Nov 3, 2024Updated last year
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models☆86Aug 19, 2024Updated last year
- ☆25Feb 19, 2025Updated last year
- Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…☆55Aug 27, 2025Updated 6 months ago
- [CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…☆27Apr 10, 2023Updated 2 years ago
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations☆12Sep 4, 2024Updated last year
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations…☆29Dec 5, 2023Updated 2 years ago
- This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World.☆41Jun 26, 2025Updated 8 months ago
- Automated Benchmarking of LLM Agents on Real-World Software Security Tasks [NeurIPS 2025]☆56Jan 27, 2026Updated last month
- ☆38Jan 15, 2025Updated last year
- [ICLR 2024] Provable Robust Watermarking for AI-Generated Text☆38Dec 4, 2023Updated 2 years ago
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"☆86Nov 28, 2023Updated 2 years ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated last year
- ☆11Mar 31, 2022Updated 3 years ago
- EmotionCircuits-LLM: A complete, reproducible framework for discovering and controlling emotion circuits in large language models.☆25Oct 20, 2025Updated 4 months ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- [NeurIPS'23] Binary Classification with Confidence Difference☆10May 13, 2024Updated last year
- ☆12Jan 11, 2026Updated last month
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- [EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents☆16Sep 16, 2025Updated 5 months ago
- ☆10Jun 23, 2018Updated 7 years ago
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- ☆42Jan 8, 2024Updated 2 years ago
- ☆73Jul 30, 2025Updated 7 months ago
- ☆38Jan 4, 2024Updated 2 years ago
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024☆50Oct 12, 2025Updated 4 months ago
- Python codes for mathematical modeling.☆12Sep 5, 2021Updated 4 years ago
- ☆12Nov 21, 2023Updated 2 years ago
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"☆11Nov 18, 2022Updated 3 years ago
- ☆11Jan 3, 2024Updated 2 years ago
- ☆16Feb 23, 2025Updated last year
- ☆12Mar 5, 2025Updated 11 months ago
- We introduce Chart2Code, the first user-driven, hierarchical benchmark that systematically evaluates Large Multimodal Models on chart-to-…☆24Jan 27, 2026Updated last month