DAMO-NLP-SG/CMM

✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

☆54

Alternatives and similar repositories for CMM

Users who are interested in CMM are comparing it to the repositories listed below.

DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆43 · Feb 27, 2025 · Updated last year
DAMO-NLP-SG / Inf-CLIP
[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for C…
☆287 · Jan 16, 2025 · Updated last year
DAMO-NLP-SG / VCD
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
☆410 · Oct 7, 2024 · Updated last year
DAMO-NLP-SG / MT-LLaMA
Multi-Task instruction-tuned LLaMA
☆14 · May 5, 2023 · Updated 3 years ago
clownrat6 / OpenVIS
[AAAI 2025] Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use.
☆26 · Dec 30, 2024 · Updated last year
alibaba-damo-academy / RynnScale
RynnScale: Scalable VLM and VLA Development Kits
☆25 · Jul 17, 2026 · Updated last week
clownrat6 / VectorNet
The implementation of VectorNet. Done and Lose
☆41 · Jun 21, 2020 · Updated 6 years ago
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆196 · Mar 17, 2025 · Updated last year
DAMO-NLP-SG / VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
☆1,304 · Jan 23, 2025 · Updated last year
ErikZ719 / CoTA
[ICLR 26] Context Tokens are Anchors: Understanding the Repeat Curse in dMLLMs from an Information Flow Perspective
☆16 · Mar 6, 2026 · Updated 4 months ago
clownrat6 / Novel_Theft
EPUB parsing and packaging for 轻小说文库 (Light Novel Library)
☆21 · May 3, 2020 · Updated 6 years ago
jpthu17 / DiCoSA
[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
☆53 · Apr 9, 2024 · Updated 2 years ago
LengSicong / Tell2Design
[ACL2023 Area Chair Award] Official repo for the paper "Tell2Design: A Dataset for Language-Guided Floor Plan Generation".
☆85 · Mar 14, 2025 · Updated last year
alibaba-damo-academy / PixelRefer
The code for PixelRefer & VideoRefer
☆352 · Nov 16, 2025 · Updated 8 months ago
rain305f / OSP
[CVPR 2023] Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning
☆22 · Jun 11, 2023 · Updated 3 years ago
DAMO-NLP-SG / contrastive-cot
Contrastive Chain-of-Thought Prompting
☆69 · Nov 18, 2023 · Updated 2 years ago
DAMO-NLP-SG / RemeMo
[EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning
☆17 · Oct 31, 2023 · Updated 2 years ago
AV-Odyssey / AV-Odyssey
This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"
☆31 · Dec 23, 2024 · Updated last year
jpthu17 / HBI
[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
☆125 · Dec 28, 2024 · Updated last year
DAMO-NLP-SG / SSTuning
Code for ACL paper "Zero-Shot Text Classification via Self-Supervised Tuning"
☆29 · Sep 25, 2023 · Updated 2 years ago
CASIA-IVA-Lab / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57 · Mar 9, 2025 · Updated last year
SaFo-Lab / AdaShield
[ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…
☆73 · Feb 9, 2026 · Updated 5 months ago
Zhao-Yian / GraCo
[CVPR 2024 Highlight] Official GraCo: Granularity-Controllable Interactive Segmentation.
☆61 · Mar 11, 2025 · Updated last year
yale-nlp / MMVU
Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"
☆76 · Feb 28, 2025 · Updated last year
zhangbaijin / From-Redundancy-to-Relevance
[NAACL 2025 Oral] From redundancy to relevance: Enhancing explainability in multimodal large language models
☆130 · Jan 30, 2026 · Updated 5 months ago
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated last year
DAMO-NLP-SG / IE-E2H
Easy-to-Hard Learning for Information Extraction (ACL 2023 Findings)
☆14 · Jul 11, 2023 · Updated 3 years ago
SihengLi99 / SEALONG
Large Language Models Can Self-Improve in Long-context Reasoning
☆72 · Nov 24, 2024 · Updated last year
DAMO-NLP-SG / Auto-Arena-LLMs
☆44 · Oct 7, 2024 · Updated last year
jpthu17 / EMCL
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
☆148 · Apr 9, 2024 · Updated 2 years ago
nickjiang2378 / vlm-hallucinations
[ICLR '25] Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"
☆105 · Nov 30, 2025 · Updated 7 months ago
sled-group / moh
[NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models
☆37 · Nov 13, 2024 · Updated last year
LengSicong / MMR1
[CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
☆217 · Sep 26, 2025 · Updated 9 months ago
jxjessieli / contextual-distortion-parser
[ACL 2023] Contextual Distortion Reveals Constituency: Mask Language Models are Implicit Parsers.
☆14 · Jun 3, 2023 · Updated 3 years ago
patrick-tssn / VideoHallucer
VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆43 · Dec 16, 2025 · Updated 7 months ago
HowardLi1984 / ECDFormer
【Nature Computational Science 2025🔥】Deep peak property learning for efficient chiral molecules ECD spectra prediction
☆51 · Jan 12, 2025 · Updated last year
RUCAIBox / Event-Bench
Official code of *Towards Event-oriented Long Video Understanding*
☆12 · Jul 26, 2024 · Updated last year
eujhwang / vn-analysis
Virtual node analysis on the OGB benchmark dataset
☆14 · Mar 9, 2023 · Updated 3 years ago
TIGER-AI-Lab / VisualWebInstruct
The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]
☆39 · Feb 1, 2026 · Updated 5 months ago