UCSB-AI/MMWorld

Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"

☆28

Alternatives and similar repositories for MMWorld

Users who are interested in MMWorld are comparing it to the repositories listed below.

UCSB-AI / SafeKey
[EMNLP 2025] Official code for the paper "SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning"
☆16 · May 12, 2026 · Updated 2 months ago
weixi-feng / TC-Bench
☆27 · Jun 22, 2024 · Updated 2 years ago
UCSB-AI / FedVLN
[ECCV 2022] Official pytorch implementation of the paper "FedVLN: Privacy-preserving Federated Vision-and-Language Navigation"
☆14 · Oct 8, 2022 · Updated 3 years ago
UCSB-AI / Mitigate-Gender-Bias-in-Image-Search
Code for the EMNLP 2021 Oral paper "Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search" https://arx…
☆12 · Feb 6, 2023 · Updated 3 years ago
Adamdad / Filter-Gradient-Decent
In this paper, we propose Filter Gradient Decent (FGD), an efficient stochastic optimization algorithm that makes a consistent estimation…
☆12 · May 18, 2021 · Updated 5 years ago
UCSB-AI / MSSBench
[ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
☆37 · Jun 23, 2025 · Updated last year
marinero4972 / CyberV
☆20 · Jun 10, 2025 · Updated last year
UCSB-AI / Discffusion
Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"
☆29 · Apr 27, 2024 · Updated 2 years ago
yulonghui / MOCA
Official implementation of "Continual Learning by Modeling Intra-Class Variation" (MOCA). [TMLR 2023]
☆16 · Mar 3, 2023 · Updated 3 years ago
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆43 · Feb 27, 2025 · Updated last year
yaolinli / TimeChat-Captioner
[ICML 2026] Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
☆49 · Jun 29, 2026 · Updated last month
UCSB-AI / ComCLIP
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
☆37 · Aug 18, 2024 · Updated last year
ajd12342 / why-winoground-hard
Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022
☆31 · May 29, 2023 · Updated 3 years ago
DAMO-NLP-SG / CMM
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
☆54 · Jul 11, 2025 · Updated last year
zai-org / LVBench
[ICCV 2025] LVBench: An Extreme Long Video Understanding Benchmark
☆145 · Jul 9, 2025 · Updated last year
princetonvisualai / merv
Unifying Specialized Visual Encoders for Video Language Models
☆25 · Nov 22, 2025 · Updated 8 months ago
yu-rp / NeuralLineage
Code for CVPR 2024 Oral "Neural Lineage"
☆17 · Jun 18, 2024 · Updated 2 years ago
Han-Zongbo / Skip-n
This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.
☆15 · Feb 12, 2024 · Updated 2 years ago
chili-lab / SPORTU
[ICLR 2025] SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
☆20 · Sep 17, 2025 · Updated 10 months ago
yuhui-zh15 / AutoConverter
Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202…
☆40 · May 26, 2025 · Updated last year
patrick-tssn / VideoHallucer
VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆43 · Dec 16, 2025 · Updated 7 months ago
WPR001 / UGC_VideoCaptioner
☆16 · Jun 23, 2026 · Updated last month
UCSB-AI / CPL
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
☆35 · Dec 5, 2022 · Updated 3 years ago
Cholianlll / Quant_Crypto
☆18 · Oct 17, 2025 · Updated 9 months ago
zhiheLu / Ensemble_VLM
Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024"
☆28 · Feb 2, 2025 · Updated last year
PolyU-ChenLab / ETBench
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
☆74 · Jan 20, 2025 · Updated last year
UCSB-AI / Aerial-Vision-and-Dialog-Navigation
Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"
☆69 · May 12, 2026 · Updated 2 months ago
RongKaiWeskerMA / INSTA
The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning
☆13 · Apr 14, 2024 · Updated 2 years ago
OpenGVLab / Awesome-LLM4Tool
A curated list of the papers, repositories, tutorials, and anythings related to the large language models for tools
☆68 · Aug 22, 2023 · Updated 2 years ago
yale-nlp / MMVU
Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"
☆76 · Feb 28, 2025 · Updated last year
kaistAI / Volcano
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆49 · Aug 21, 2024 · Updated last year
meera1hahn / NRNS
Official implementation of the NRNS paper
☆37 · Jun 13, 2022 · Updated 4 years ago
dongyh20 / Insight-V
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240 · Nov 7, 2025 · Updated 8 months ago
xu1998hz / SEScore
This repo contains all the codes for SEScore implementation
☆15 · Mar 3, 2025 · Updated last year
thunlp / KARL
KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
☆68 · Apr 5, 2026 · Updated 3 months ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
jiyt17 / IDA-VLM
[ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
☆37 · Nov 27, 2024 · Updated last year
manipulate-in-dream / MinD
☆19 · Sep 4, 2025 · Updated 10 months ago
chuntianli666 / CrossVid
[AAAI 2026] CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models
☆23 · Jul 9, 2026 · Updated 3 weeks ago