Scripts for downloading AudioSet
☆87Nov 7, 2017Updated 8 years ago
Alternatives and similar repositories for audiosetdl
Users who are interested in audiosetdl are comparing it to the libraries listed below.
- VGGSound: A Large-scale Audio-Visual Dataset☆357Sep 13, 2021Updated 4 years ago
- ☆43Feb 21, 2023Updated 3 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Jun 1, 2023Updated 2 years ago
- An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.☆10Feb 22, 2022Updated 4 years ago
- This package aims at simplifying the download of the AudioCaps dataset.☆36Dec 1, 2023Updated 2 years ago
- ☆62Jun 15, 2025Updated 10 months ago
- Official Repository for "Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization" (ACM MM 2023)☆18Nov 14, 2023Updated 2 years ago
- A dataset for Audio-Visual Sound Event Detection in Movies☆26Jan 23, 2023Updated 3 years ago
- Download the VGGSound dataset☆22Feb 22, 2022Updated 4 years ago
- Localizing Visual Sounds the Hard Way☆83Jul 6, 2022Updated 3 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"☆15Jun 22, 2023Updated 2 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at BMVC 2022)☆56Jan 29, 2024Updated 2 years ago
- Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation☆15Apr 7, 2025Updated last year
- The repo hosts the code and model of MAViL.☆45Jul 24, 2023Updated 2 years ago
- 📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).☆106Aug 1, 2023Updated 2 years ago
- The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"☆26May 18, 2023Updated 2 years ago
- Download AudioSet data quickly with youtube-dl, ffmpeg, and Python multiprocessing☆48Aug 1, 2024Updated last year
- This repository contains metadata of the WavCaps dataset and code for downstream tasks.☆258Jul 25, 2024Updated last year
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".☆290Mar 20, 2024Updated 2 years ago
- This repo hosts the code and models of "Masked Autoencoders that Listen".☆657Apr 5, 2024Updated 2 years ago
- Official implementation for AVGN☆41Mar 24, 2023Updated 3 years ago
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation☆78Mar 29, 2024Updated 2 years ago
- ☆11Sep 1, 2024Updated last year
- The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.☆29Mar 4, 2022Updated 4 years ago
- Official Code of ICCV 2021 Paper: Learning to Cut by Watching Movies☆51Nov 9, 2022Updated 3 years ago
- 🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps☆208Oct 6, 2025Updated 6 months ago
- ☆12Aug 25, 2023Updated 2 years ago
- Toolkit for downloading and processing Google's AudioSet dataset.☆180Aug 22, 2025Updated 7 months ago
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners☆155Jul 6, 2024Updated last year
- Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S…☆54Dec 15, 2020Updated 5 years ago
- SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music.☆11Nov 15, 2025Updated 5 months ago
- Audio Dataset for training CLAP and other models☆734Jan 8, 2026Updated 3 months ago
- The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"☆485Sep 18, 2025Updated 7 months ago
- Experiments on AudioSet☆43Jul 22, 2023Updated 2 years ago
- MUSIC Dataset from The Sound of Pixels (ECCV '18)☆136Aug 12, 2022Updated 3 years ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 7 months ago
- Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)☆85Dec 3, 2024Updated last year
- Enhanced sound event localization and detection in real 360-degree audio-visual soundscapes (DCASE task3 format)☆14Mar 21, 2025Updated last year
- Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation☆125Jan 18, 2023Updated 3 years ago