The first comprehensive multimodal language analysis benchmark for evaluating foundation models
☆29 · Sep 22, 2025 · Updated 6 months ago
Alternatives and similar repositories for MMLA
Users who are interested in MMLA are comparing it to the libraries listed below
- Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances (ACL 2024) · ☆30 · Dec 7, 2024 · Updated last year
- On Path to Multimodal Generalist: General-Level and General-Bench · ☆18 · Jul 11, 2025 · Updated 8 months ago
- ☆15 · May 30, 2025 · Updated 9 months ago
- [ACM MM2024] The code for HMLLM. · ☆11 · Oct 27, 2024 · Updated last year
- Koishi's Day 2024 Paper (NeurIPS 2024): An advanced persona-driven role-playing system with global faithfulness quantification and optimi… · ☆11 · Oct 19, 2025 · Updated 5 months ago
- ☆16 · Nov 11, 2025 · Updated 4 months ago
- Paper List for Dialogue and Interactive Systems · ☆15 · Jun 5, 2020 · Updated 5 years ago
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H… · ☆21 · Jul 26, 2025 · Updated 7 months ago
- TEXTOIR is the first opensource toolkit for text open intent recognition. (ACL 2021) · ☆243 · Nov 26, 2025 · Updated 3 months ago
- [Preprint] Efficient Generative Model Training via Embedded Representation Warmup · ☆36 · Oct 15, 2025 · Updated 5 months ago
- DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging · ☆47 · Apr 27, 2025 · Updated 10 months ago
- ☆24 · Dec 23, 2024 · Updated last year
- XL-VLMs: General Repository for eXplainable Large Vision Language Models · ☆47 · Sep 8, 2025 · Updated 6 months ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO · ☆81 · Oct 29, 2025 · Updated 4 months ago
- [CVPR 2025 GMCV] Test-Time Frequency Scaling: Instant Frequency Control for Any Diffusion Model · ☆55 · May 31, 2025 · Updated 9 months ago
- [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit… · ☆31 · Feb 5, 2026 · Updated last month
- [AAAI 2025] Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding · ☆34 · Mar 21, 2025 · Updated last year
- ☆35 · Mar 24, 2025 · Updated 11 months ago
- ☆27 · Apr 29, 2025 · Updated 10 months ago
- This is the official repository for paper: "Human Simulacra: Benchmarking the Personification of Large Language Models" [ICLR 2025] · ☆30 · Feb 10, 2025 · Updated last year
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… · ☆100 · Sep 8, 2025 · Updated 6 months ago
- Deep Unknown Intent Detection with Margin Loss (ACL2019) · ☆35 · Dec 8, 2022 · Updated 3 years ago
- ☆12 · Jan 26, 2023 · Updated 3 years ago
- ☆99 · Jun 23, 2025 · Updated 8 months ago
- [NeurIPS 2025] IEAP: Image Editing As Programs with Diffusion Models · ☆113 · Sep 27, 2025 · Updated 5 months ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" · ☆37 · Aug 29, 2023 · Updated 2 years ago
- [AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification · ☆12 · Mar 10, 2025 · Updated last year
- Implementation of Qformer from BLIP2 in Zeta Lego blocks. · ☆49 · Nov 11, 2024 · Updated last year
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25) · ☆14 · Jun 26, 2025 · Updated 8 months ago
- ☆15 · Feb 18, 2024 · Updated 2 years ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework · ☆71 · Jun 1, 2025 · Updated 9 months ago
- A reading list for research topics in multimodal deception detection. · ☆44 · Aug 29, 2023 · Updated 2 years ago
- code for our EMNLP 2017 paper "DOC: Deep Open Classification of Text Documents" · ☆30 · Apr 18, 2019 · Updated 6 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… · ☆12 · Jun 28, 2025 · Updated 8 months ago
- LIMI: Less is More for Agency · ☆160 · Oct 14, 2025 · Updated 5 months ago
- An Interactive Introduction to Model-Agnostic Meta-Learning · ☆10 · Sep 18, 2022 · Updated 3 years ago
- [Official Implementation] Improving Editability in Image Generation with Layer-wise Memory, CVPR 2025 · ☆37 · Mar 2, 2026 · Updated 2 weeks ago
- Official code of paper "Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models" · ☆87 · May 27, 2025 · Updated 9 months ago
- Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation · ☆125 · Mar 2, 2026 · Updated 2 weeks ago