thuiar / MMLA
The first comprehensive multimodal language analysis benchmark for evaluating foundation models
☆21 Updated 2 months ago
Alternatives and similar repositories for MMLA
Users who are interested in MMLA are comparing it to the repositories listed below
- ☆55 Updated last year
- OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Rea… ☆98 Updated 2 months ago
- MIntRec2.0 is the first large-scale dataset for multimodal intent recognition and out-of-scope detection in multi-party conversations (IC… ☆58 Updated last month
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆84 Updated 7 months ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability" ☆33 Updated last year
- Multimodal Empathetic Chatbot ☆45 Updated last year
- ☆14 Updated 9 months ago
- HumanOmni ☆194 Updated 6 months ago
- A comprehensive overview of affective computing research in the era of large language models (LLMs). ☆26 Updated last year
- Synth-Empathy: Towards High-Quality Synthetic Empathy Data ☆15 Updated 6 months ago
- ☆82 Updated last year
- ☆101 Updated 2 months ago
- [ACL 2024] A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset ☆18 Updated 3 months ago
- ☆21 Updated 8 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆17 Updated 5 months ago
- [ACL24] EmoBench: Evaluating the Emotional Intelligence of Large Language Models ☆90 Updated 4 months ago
- Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation ☆30 Updated 5 months ago
- On Path to Multimodal Generalist: General-Level and General-Bench ☆19 Updated 2 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆28 Updated 2 weeks ago
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information ☆15 Updated 10 months ago
- The code and data of We-Math, accepted by ACL 2025 main conference. ☆135 Updated 3 weeks ago
- Official repository of MMDU dataset ☆93 Updated 11 months ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆48 Updated 2 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆29 Updated last week
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation ☆16 Updated 7 months ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi… ☆56 Updated 4 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆17 Updated 3 weeks ago
- ☆90 Updated last year
- Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042) ☆67 Updated 6 months ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆47 Updated 5 months ago