[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆24Mar 8, 2026Updated 2 months ago
Alternatives and similar repositories for AVHBench
Users that are interested in AVHBench are comparing it to the libraries listed below.
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- Music Language Model Generation, Optimization, and Practice☆57Apr 20, 2026Updated last month
- [CVPR 2026] Fine-Grained GRPO for Precise Preference Alignment in Flow Models☆57Mar 26, 2026Updated last month
- Source code of attributed graph generator☆11Feb 10, 2023Updated 3 years ago
- Reddit Crawler API for collecting datasets from Reddit.☆11Dec 31, 2022Updated 3 years ago
- YOLOv8 safety helmet and workwear detection☆12Oct 13, 2023Updated 2 years ago
- The official repository of TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…☆28Nov 18, 2025Updated 6 months ago
- ControlFoley: Unified and Controllable Video-to-Audio Generation with Cross-Modal Conflict Handling☆88Apr 22, 2026Updated last month
- Python implementation of public-opinion analysis for trending Weibo events (crawler)☆12May 5, 2022Updated 4 years ago
- ☆30Jun 19, 2025Updated 11 months ago
- ☆16Sep 29, 2025Updated 7 months ago
- Code repository for GCT634 Musical Applications of Machine Learning (Spring 2024)☆11May 19, 2024Updated 2 years ago
- Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation☆62Jun 26, 2025Updated 10 months ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆36May 27, 2025Updated 11 months ago
- ☆17Nov 29, 2024Updated last year
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.☆134Apr 7, 2026Updated last month
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago
- Repository of the IJCV'26 & WACV'24 paper☆34Apr 27, 2026Updated 3 weeks ago
- Tools for the evaluation of audio captioning.☆19May 23, 2020Updated 6 years ago
- Official Implementation of Avoiding spurious correlations via logit correction☆17May 6, 2023Updated 3 years ago
- ☆48Mar 16, 2026Updated 2 months ago
- ☆57Aug 16, 2025Updated 9 months ago
- Official code for "A Closer Look at Audio-Visual Segmentation"☆97Oct 31, 2025Updated 6 months ago
- Official code of ElasticAST (Interspeech 2024 paper)☆34Jul 30, 2024Updated last year
- ☆23Aug 26, 2023Updated 2 years ago
- Official code for SongEcho☆63Mar 3, 2026Updated 2 months ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)☆18Dec 20, 2022Updated 3 years ago
- ☆19Feb 5, 2026Updated 3 months ago
- ☆71Dec 30, 2025Updated 4 months ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"☆140Sep 4, 2025Updated 8 months ago
- Pythonic file-system interface for TOS (Tinder Object Storage) https://tosfs.readthedocs.io/en/latest/ ☆17Mar 27, 2026Updated last month
- Code for "NVUM: Non-volatile Unbiased Memory for Robust Medical Classification" [MICCAI 2022 Early Accept]☆12Sep 6, 2022Updated 3 years ago
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"☆19Jul 7, 2024Updated last year
- [SIGGRAPH21] DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers☆49Jun 8, 2022Updated 3 years ago
- ☆86Apr 8, 2026Updated last month
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning☆36Aug 28, 2025Updated 8 months ago
- Open SingSong - Implementation of 'SingSong: Generating Musical Accompaniments from Singing' by Google Research, with a few modifications☆16Jun 10, 2024Updated last year
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes☆102Dec 4, 2024Updated last year