speedyseal / audiosetdl
Scripts for downloading AudioSet
☆86 · Nov 7, 2017 · Updated 8 years ago
Alternatives and similar repositories for audiosetdl
Users who are interested in audiosetdl are comparing it to the libraries listed below.
- VGGSound: A Large-scale Audio-Visual Dataset · ☆350 · Sep 13, 2021 · Updated 4 years ago
- ☆43 · Feb 21, 2023 · Updated 2 years ago
- An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion. · ☆10 · Feb 22, 2022 · Updated 3 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) · ☆12 · Jun 1, 2023 · Updated 2 years ago
- This package aims at simplifying the download of the AudioCaps dataset. · ☆36 · Dec 1, 2023 · Updated 2 years ago
- Download the VGGSound dataset · ☆22 · Feb 22, 2022 · Updated 3 years ago
- Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation · ☆14 · Apr 7, 2025 · Updated 10 months ago
- A dataset for Audio-Visual Sound Event Detection in Movies · ☆26 · Jan 23, 2023 · Updated 3 years ago
- This repo hosts the code and model of MAViL. · ☆45 · Jul 24, 2023 · Updated 2 years ago
- ☆62 · Jun 15, 2025 · Updated 8 months ago
- Download AudioSet data quickly with youtube-dl, ffmpeg, and Python multiprocessing · ☆45 · Aug 1, 2024 · Updated last year
- This repository contains metadata of the WavCaps dataset and code for downstream tasks. · ☆256 · Jul 25, 2024 · Updated last year
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at BMVC 2022) · ☆54 · Jan 29, 2024 · Updated 2 years ago
- Localizing Visual Sounds the Hard Way · ☆82 · Jul 6, 2022 · Updated 3 years ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder". · ☆286 · Mar 20, 2024 · Updated last year
- 📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes). · ☆104 · Aug 1, 2023 · Updated 2 years ago
- The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning" · ☆25 · May 18, 2023 · Updated 2 years ago
- ☆13 · Jun 2, 2022 · Updated 3 years ago
- ☆11 · Sep 1, 2024 · Updated last year
- Functions for creating speech features in MATLAB. · ☆13 · Jul 7, 2020 · Updated 5 years ago
- Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline. · ☆17 · May 9, 2025 · Updated 9 months ago
- ☆23 · Oct 5, 2017 · Updated 8 years ago
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation · ☆78 · Mar 29, 2024 · Updated last year
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations · ☆17 · Mar 31, 2025 · Updated 10 months ago
- TG-CRITIC: A TIMBRE-GUIDED MODEL FOR REFERENCE-INDEPENDENT SINGING EVALUATION · ☆15 · May 26, 2023 · Updated 2 years ago
- 🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps · ☆203 · Oct 6, 2025 · Updated 4 months ago
- [Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA models, now implemented with the base model AudioLDM2. · ☆34 · Mar 11, 2025 · Updated 11 months ago
- This repo hosts the code and models of "Masked Autoencoders that Listen". · ☆647 · Apr 5, 2024 · Updated last year
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback" · ☆32 · Mar 4, 2025 · Updated 11 months ago
- Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021) · ☆85 · Dec 3, 2024 · Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs · ☆47 · Sep 19, 2025 · Updated 4 months ago
- Download AudioSet for Vision-Audio-Text Pre-training · ☆13 · May 16, 2022 · Updated 3 years ago
- Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S… · ☆54 · Dec 15, 2020 · Updated 5 years ago
- Toolkit for downloading and processing Google's AudioSet dataset. · ☆175 · Aug 22, 2025 · Updated 5 months ago
- Audio Dataset for training CLAP and other models · ☆729 · Jan 8, 2026 · Updated last month
- Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP · ☆356 · Feb 15, 2022 · Updated 4 years ago
- ☆12 · Aug 25, 2023 · Updated 2 years ago
- The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection" · ☆470 · Sep 18, 2025 · Updated 4 months ago
- Official implementation for AVGN · ☆40 · Mar 24, 2023 · Updated 2 years ago