[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆20 · Mar 8, 2026 · Updated last month
Alternatives and similar repositories for AVHBench
Users that are interested in AVHBench are comparing it to the libraries listed below.
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- [CVPR 2026] Fine-Grained GRPO for Precise Preference Alignment in Flow Models ☆54 · Mar 26, 2026 · Updated 2 weeks ago
- The official repository for TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module… ☆26 · Nov 18, 2025 · Updated 4 months ago
- YOLOv8 safety helmet and work uniform detection ☆13 · Oct 13, 2023 · Updated 2 years ago
- Reddit Crawler API for collecting datasets from Reddit. ☆11 · Dec 31, 2022 · Updated 3 years ago
- Python implementation of public-opinion analysis for trending Weibo events (crawler) ☆12 · May 5, 2022 · Updated 3 years ago
- ☆31 · Jun 19, 2025 · Updated 9 months ago
- ☆16 · Sep 29, 2025 · Updated 6 months ago
- Code repository for GCT634 Musical Applications of Machine Learning (Spring 2024) ☆11 · May 19, 2024 · Updated last year
- Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation ☆62 · Jun 26, 2025 · Updated 9 months ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition" ☆12 · Feb 27, 2024 · Updated 2 years ago
- ☆17 · Nov 29, 2024 · Updated last year
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner. ☆129 · Apr 7, 2026 · Updated last week
- ☆44 · Mar 16, 2026 · Updated 3 weeks ago
- [ICCV 2021] Multimodal Knowledge Expansion ☆10 · Aug 28, 2021 · Updated 4 years ago
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆34 · Feb 21, 2025 · Updated last year
- Tools for the evaluation of audio captioning. ☆19 · May 23, 2020 · Updated 5 years ago
- Official Implementation of Avoiding spurious correlations via logit correction ☆17 · May 6, 2023 · Updated 2 years ago
- ☆57 · Aug 16, 2025 · Updated 7 months ago
- Official code for "A Closer Look at Audio-Visual Segmentation" ☆96 · Oct 31, 2025 · Updated 5 months ago
- Official code of ElasticAST (Interspeech 2024 paper) ☆34 · Jul 30, 2024 · Updated last year
- ☆68 · Dec 30, 2025 · Updated 3 months ago
- Official code for SongEcho ☆55 · Mar 3, 2026 · Updated last month
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022) ☆17 · Dec 20, 2022 · Updated 3 years ago
- ☆23 · Aug 26, 2023 · Updated 2 years ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆139 · Sep 4, 2025 · Updated 7 months ago
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation" ☆18 · Jul 7, 2024 · Updated last year
- ☆19 · Feb 5, 2026 · Updated 2 months ago
- ☆81 · Updated this week
- Pythonic file-system interface for TOS (Tinder Object Storage) https://tosfs.readthedocs.io/en/latest/ ☆17 · Mar 27, 2026 · Updated 2 weeks ago
- Code for "NVUM: Non-volatile Unbiased Memory for Robust Medical Classification" [MICCAI 2022 Early Accept] ☆12 · Sep 6, 2022 · Updated 3 years ago
- [SIGGRAPH21] DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers ☆49 · Jun 8, 2022 · Updated 3 years ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆35 · Aug 28, 2025 · Updated 7 months ago
- Open SingSong - Implementation of 'SingSong: Generating Musical Accompaniments from Singing' by Google Research, with a few modifications ☆16 · Jun 10, 2024 · Updated last year
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes ☆99 · Dec 4, 2024 · Updated last year
- https://guanyingc.github.io/DeepHDRVideo/ ☆15 · Sep 27, 2021 · Updated 4 years ago
- ☆40 · Apr 14, 2025 · Updated last year
- About ER-X Chinese localization testing ☆10 · Mar 8, 2021 · Updated 5 years ago
- Automatically exported from code.google.com/p/baposter ☆16 · Dec 15, 2015 · Updated 10 years ago