[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆20Mar 8, 2026Updated 2 weeks ago
Alternatives and similar repositories for AVHBench
Users that are interested in AVHBench are comparing it to the libraries listed below.
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- YOLOv8 safety helmet and workwear detection☆12Oct 13, 2023Updated 2 years ago
- Reddit Crawler API for collecting datasets from Reddit.☆11Dec 31, 2022Updated 3 years ago
- Python implementation of public-opinion analysis for trending Weibo events (web crawler)☆12May 5, 2022Updated 3 years ago
- ☆31Jun 19, 2025Updated 9 months ago
- Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation☆62Jun 26, 2025Updated 8 months ago
- ☆16Sep 29, 2025Updated 5 months ago
- Code repository for GCT634 Musical Applications of Machine Learning (Spring 2024)☆11May 19, 2024Updated last year
- ☆34Mar 16, 2026Updated last week
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.☆124Updated this week
- ☆17Nov 29, 2024Updated last year
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?"☆34Feb 21, 2025Updated last year
- Tools for the evaluation of audio captioning.☆19May 23, 2020Updated 5 years ago
- Official Implementation of Avoiding spurious correlations via logit correction☆17May 6, 2023Updated 2 years ago
- ☆57Aug 16, 2025Updated 7 months ago
- Official code for "A Closer Look at Audio-Visual Segmentation"☆95Oct 31, 2025Updated 4 months ago
- Official code of ElasticAST (Interspeech 2024 paper)☆34Jul 30, 2024Updated last year
- ☆76Feb 26, 2026Updated 3 weeks ago
- ☆68Dec 30, 2025Updated 2 months ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)☆17Dec 20, 2022Updated 3 years ago
- ☆23Aug 26, 2023Updated 2 years ago
- Official code for SongEcho☆52Mar 3, 2026Updated 3 weeks ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"☆139Sep 4, 2025Updated 6 months ago
- ☆18Feb 5, 2026Updated last month
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"☆18Jul 7, 2024Updated last year
- CVE-Factory☆65Feb 13, 2026Updated last month
- Pythonic file-system interface for TOS (Tinder Object Storage) https://tosfs.readthedocs.io/en/latest/ ☆17Mar 10, 2026Updated 2 weeks ago
- Code for "NVUM: Non-volatile Unbiased Memory for Robust Medical Classification" [MICCAI 2022 Early Accept]☆12Sep 6, 2022Updated 3 years ago
- [SIGGRAPH21] DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers☆49Jun 8, 2022Updated 3 years ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning☆35Aug 28, 2025Updated 6 months ago
- Open SingSong - Implementation of 'SingSong: Generating Musical Accompaniments from Singing' by Google Research, with a few modifications☆16Jun 10, 2024Updated last year
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes☆98Dec 4, 2024Updated last year
- https://guanyingc.github.io/DeepHDRVideo/☆15Sep 27, 2021Updated 4 years ago
- ☆40Apr 14, 2025Updated 11 months ago
- About ER-X Chinese localization testing☆10Mar 8, 2021Updated 5 years ago
- Automatically exported from code.google.com/p/baposter☆16Dec 15, 2015Updated 10 years ago
- Solos: A Dataset for Audio-Visual Music Analysis☆24Feb 17, 2023Updated 3 years ago