Code for "A Large-scale Dataset for Audio-Language Representation Learning"
☆14Sep 18, 2024Updated last year
Alternatives and similar repositories for Auto-ACD
Users who are interested in Auto-ACD are comparing it to the libraries listed below.
- Code implementation of RP3D-Diag☆17Nov 25, 2024Updated last year
- The official codes for "M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging"☆43Jul 28, 2025Updated 9 months ago
- The official codes for "AutoRG-Brain: Grounded Report Generation for Brain MRI".☆55Jan 6, 2026Updated 3 months ago
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"☆32Mar 4, 2025Updated last year
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.☆20Feb 15, 2024Updated 2 years ago
- [ECCV 2024 Oral] Knowledge-enhanced pretraining for computational pathology☆49Apr 17, 2026Updated 2 weeks ago
- ☆19May 19, 2024Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Jun 1, 2023Updated 2 years ago
- [Cancer Cell, 2026] The official codes for "A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis"☆57Apr 17, 2026Updated 2 weeks ago
- ☆14Jul 1, 2024Updated last year
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆35May 27, 2025Updated 11 months ago
- [EMNLP 2024] RaTEScore: A Metric for Radiology Report Generation☆65May 18, 2025Updated 11 months ago
- ☆15Jun 15, 2022Updated 3 years ago
- ☆28Jul 18, 2025Updated 9 months ago
- [ICML'24] Creative Text-to-Audio Generation via Synthesizer Programming☆39Sep 26, 2024Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆82Jan 19, 2026Updated 3 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- Source code for the paper 'Audio Captioning Transformer'☆56Jan 18, 2022Updated 4 years ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆118Jan 28, 2026Updated 3 months ago
- ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features refer…☆39Apr 10, 2026Updated 3 weeks ago
- ☆53Mar 24, 2026Updated last month
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆38Oct 11, 2024Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 7 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- Runtime repository for the SNOMED CT Entity Linking challenge on DrivenData☆14Mar 5, 2024Updated 2 years ago
- The official repository of MM-R5☆29Jun 22, 2025Updated 10 months ago
- [Nature Communications, 2026] The official code for "Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subt…☆22Apr 14, 2026Updated 2 weeks ago
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆71Oct 8, 2025Updated 6 months ago
- Notes on Hung-yi Lee's Machine Learning 2021 course☆14Nov 27, 2022Updated 3 years ago
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation☆72Jul 25, 2023Updated 2 years ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆40Updated this week
- My personal solutions to some textbook problems☆11Feb 12, 2020Updated 6 years ago
- ☆10Aug 20, 2023Updated 2 years ago
- PyTorch implementation of [AudioLCM]: an efficient and high-quality text-to-audio generation with latent consistency model.☆13Jun 15, 2024Updated last year
- Music production for silent film clips.☆32Apr 30, 2025Updated last year
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 7 months ago
- [ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence"☆41Apr 19, 2026Updated last week
- Official PyTorch code of GroundVQA (CVPR'24)☆64Sep 13, 2024Updated last year
- Material for the course of "Mathematics of Transformer"☆22Aug 3, 2025Updated 8 months ago