thuiar / MMLA
The first comprehensive multimodal language analysis benchmark for evaluating foundation models
☆28 Updated 4 months ago
Alternatives and similar repositories for MMLA
Users interested in MMLA are comparing it to the repositories listed below
- ☆59 Updated last year
- (NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align… ☆123 Updated 2 months ago
- MIntRec2.0 is the first large-scale dataset for multimodal intent recognition and out-of-scope detection in multi-party conversations (IC… ☆70 Updated 5 months ago
- ☆88 Updated last year
- A Self-Training Framework for Vision-Language Reasoning ☆88 Updated last year
- A Survey on Benchmarks of Multimodal Large Language Models ☆146 Updated 6 months ago
- ☆135 Updated 2 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 Updated 3 months ago
- Official repository of the MMDU dataset ☆102 Updated last year
- This repository hosts the code, data, and model weights of PanoSent. ☆59 Updated 6 months ago
- (ICCV 2025) Official repository of the paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆44 Updated 6 months ago
- [ICLR 2025] ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation ☆130 Updated last month
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆38 Updated 2 months ago
- The code and data of We-Math, accepted to the ACL 2025 main conference ☆134 Updated last month
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs, and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆74 Updated 7 months ago
- ☆21 Updated 8 months ago
- A project for tri-modal LLM benchmarking and instruction tuning ☆54 Updated 9 months ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi… ☆70 Updated 8 months ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆53 Updated 6 months ago
- LMM solved catastrophic forgetting, AAAI 2025 ☆45 Updated 9 months ago
- [ACL24] EmoBench: Evaluating the Emotional Intelligence of Large Language Models ☆108 Updated 8 months ago
- HumanOmni ☆216 Updated 10 months ago
- This is for the ACL 2025 Findings paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities ☆86 Updated 3 weeks ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability" ☆34 Updated last year
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆58 Updated last year
- Official PyTorch implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced … ☆90 Updated last year
- ☆19 Updated 3 months ago
- Code and data for the paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation" ☆24 Updated 3 months ago
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information ☆15 Updated last year
- Synth-Empathy: Towards High-Quality Synthetic Empathy Data ☆18 Updated 10 months ago