Code for "A Large-scale Dataset for Audio-Language Representation Learning"
☆14 · Sep 18, 2024 · Updated last year
Alternatives and similar repositories for Auto-ACD
Users who are interested in Auto-ACD are comparing it to the libraries listed below.
- The official repository for "One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts" ☆10 · Aug 16, 2024 · Updated last year
- The official codes for "M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging" ☆45 · Jul 28, 2025 · Updated 10 months ago
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers. ☆20 · Feb 15, 2024 · Updated 2 years ago
- [ECCV 2024 Oral] Knowledge-enhanced pretraining for computational pathology ☆50 · Apr 17, 2026 · Updated last month
- ☆19 · May 19, 2024 · Updated 2 years ago
- The official codes for "Can Modern LLMs Act as Agent Cores in Radiology Environments?" ☆29 · Jan 22, 2025 · Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) ☆12 · Jun 1, 2023 · Updated 3 years ago
- [Cancer Cell, 2026] The official codes for "A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis" ☆61 · Apr 17, 2026 · Updated last month
- ☆14 · Jul 1, 2024 · Updated last year
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos ☆37 · May 27, 2025 · Updated last year
- ☆15 · Jun 15, 2022 · Updated 3 years ago
- ☆28 · Jul 18, 2025 · Updated 10 months ago
- [ICML'24] Creative Text-to-Audio Generation via Synthesizer Programming ☆40 · Sep 26, 2024 · Updated last year
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo… ☆20 · Sep 1, 2023 · Updated 2 years ago
- Source code for the paper 'Audio Captioning Transformer' ☆56 · Jan 18, 2022 · Updated 4 years ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆118 · Jan 28, 2026 · Updated 4 months ago
- ☆53 · Mar 24, 2026 · Updated 2 months ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation" ☆38 · Oct 11, 2024 · Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆47 · Sep 19, 2025 · Updated 8 months ago
- [Nature Communications, 2026] The official code for "Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subt… ☆27 · Apr 14, 2026 · Updated last month
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆76 · Oct 8, 2025 · Updated 8 months ago
- ☆52 · Sep 10, 2024 · Updated last year
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation ☆72 · Jul 25, 2023 · Updated 2 years ago
- My personal solutions to some textbook problems ☆11 · Feb 12, 2020 · Updated 6 years ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities ☆42 · Apr 28, 2026 · Updated last month
- ☆10 · Aug 20, 2023 · Updated 2 years ago
- Sound Separation, Omni modal ☆28 · Sep 15, 2025 · Updated 8 months ago
- TensorFlow implementation of the Dissimilarity Mixture Autoencoder: https://arxiv.org/abs/2006.08177 ☆13 · Dec 8, 2022 · Updated 3 years ago
- ☆13 · Sep 12, 2024 · Updated last year
- [ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence" ☆42 · Apr 19, 2026 · Updated last month
- Official PyTorch code of GroundVQA (CVPR'24) ☆63 · Sep 13, 2024 · Updated last year
- ☆14 · Sep 4, 2020 · Updated 5 years ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆55 · Nov 26, 2024 · Updated last year
- Multidimensional Dictionary Learning ☆10 · Sep 27, 2017 · Updated 8 years ago
- Repository for the Introduction to Machine Learning and Deep Learning course as part of the International Graduate Summer School in Mathe… ☆11 · Aug 8, 2019 · Updated 6 years ago
- This package aims at simplifying the download of the AudioCaps dataset. ☆35 · Dec 1, 2023 · Updated 2 years ago
- Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence ☆21 · Jun 14, 2024 · Updated last year
- This is the official implementation of RL-Chord (TNNLS). ☆13 · Jan 2, 2024 · Updated 2 years ago
- Small audio language model for reasoning ☆86 · Dec 4, 2025 · Updated 6 months ago