Code for "A Large-scale Dataset for Audio-Language Representation Learning"
☆14Sep 18, 2024Updated last year
Alternatives and similar repositories for Auto-ACD
Users that are interested in Auto-ACD are comparing it to the libraries listed below.
- The official repository for "One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts"☆10Aug 16, 2024Updated last year
- The official codes for "M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging"☆38Jul 28, 2025Updated 8 months ago
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"☆32Mar 4, 2025Updated last year
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.☆20Feb 15, 2024Updated 2 years ago
- [ECCV 2024 Oral] Knowledge-enhanced pretraining for computational pathology☆47Oct 1, 2025Updated 6 months ago
- ☆19May 19, 2024Updated last year
- The official codes for "Can Modern LLMs Act as Agent Cores in Radiology Environments?"☆28Jan 22, 2025Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Jun 1, 2023Updated 2 years ago
- [Cancer Cell] The official codes for "A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis"☆49Mar 2, 2026Updated last month
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆34May 27, 2025Updated 10 months ago
- ☆15Jun 15, 2022Updated 3 years ago
- [ICML'24] Creative Text-to-Audio Generation via Synthesizer Programming☆39Sep 26, 2024Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆81Jan 19, 2026Updated 2 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆115Jan 28, 2026Updated 2 months ago
- ☆52Mar 24, 2026Updated 2 weeks ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆38Oct 11, 2024Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 6 months ago
- The official repository of MM-R5☆28Jun 22, 2025Updated 9 months ago
- Runtime repository for the SNOMED CT Entity Linking challenge on DrivenData☆14Mar 5, 2024Updated 2 years ago
- The official code for "Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subtyping"☆20Mar 27, 2026Updated 2 weeks ago
- ☆52Sep 10, 2024Updated last year
- Notes for Hung-yi Lee's Machine Learning 2021 course☆14Nov 27, 2022Updated 3 years ago
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation☆72Jul 25, 2023Updated 2 years ago
- My personal solutions to some textbook problems☆11Feb 12, 2020Updated 6 years ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆40Mar 25, 2026Updated 2 weeks ago
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking☆38Jan 20, 2026Updated 2 months ago
- ☆10Aug 20, 2023Updated 2 years ago
- PyTorch Implementation of [AudioLCM]: an efficient and high-quality text-to-audio generation with latent consistency model.☆13Jun 15, 2024Updated last year
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 6 months ago
- Music production for silent film clips.☆32Apr 30, 2025Updated 11 months ago
- TensorFlow implementation of the Dissimilarity Mixture Autoencoder: https://arxiv.org/abs/2006.08177☆13Dec 8, 2022Updated 3 years ago
- ☆13Sep 12, 2024Updated last year
- [ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence"☆41Jan 17, 2026Updated 2 months ago
- Official PyTorch code of GroundVQA (CVPR'24)☆64Sep 13, 2024Updated last year
- A Comprehensive Rare Disease Diagnostic Dataset with nearly 50,000 patients covering more than 4000 diseases☆41Mar 13, 2026Updated 3 weeks ago
- Material for the course of "Mathematics of Transformer"☆21Aug 3, 2025Updated 8 months ago
- ☆14Sep 4, 2020Updated 5 years ago
- [CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding☆36Jul 22, 2025Updated 8 months ago