speedyseal/audiosetdl

Scripts for downloading AudioSet

☆89

Alternatives and similar repositories for audiosetdl

Users who are interested in audiosetdl are comparing it to the repositories listed below.

hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
sony / CLIPSep
☆43 · Feb 21, 2023 · Updated 3 years ago
chorowski-lab / CPC_audio
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆10 · Feb 22, 2022 · Updated 4 years ago
ariesssxu / vta-ldm
☆61 · Jun 15, 2025 · Updated last year
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆35 · Dec 1, 2023 · Updated 2 years ago
VisualAIKHU / SIRA-SSL
Official Repository for "Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization" (ACM MM 2023)
☆18 · Nov 14, 2023 · Updated 2 years ago
usc-sail / mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
☆26 · Jan 23, 2023 · Updated 3 years ago
hhc1997 / vggsound_download
Download the VGGSound dataset
☆22 · Feb 22, 2022 · Updated 4 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
OpenNLPLab / TAVGBench
Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation
☆15 · Apr 7, 2025 · Updated last year
andrebola / contrastive-mir-learning
This repo contains the code to reproduce the paper "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"
☆15 · Jun 22, 2023 · Updated 3 years ago
facebookresearch / MAViL
The repo hosts the code and model of MAViL.
☆45 · Jul 24, 2023 · Updated 3 years ago
jim-schwoebel / download_audioset
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
☆106 · Aug 1, 2023 · Updated 2 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
dlrudco / Fast-Audioset-Download
Download AudioSet data quickly with youtube-dl, ffmpeg, and Python multiprocessing
☆48 · Aug 1, 2024 · Updated last year
XinhaoMei / WavCaps
This repository contains metadata of the WavCaps dataset and code for downstream tasks.
☆264 · Jul 25, 2024 · Updated 2 years ago
YuanGongND / cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
☆292 · Mar 20, 2024 · Updated 2 years ago
facebookresearch / AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
☆673 · Apr 5, 2024 · Updated 2 years ago
stoneMo / AVGN
Official implementation for AVGN
☆42 · Mar 24, 2023 · Updated 3 years ago
zhuole1025 / SymMV
[ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation
☆78 · Mar 29, 2024 · Updated 2 years ago
JongSuk1 / AVCap
☆11 · Sep 1, 2024 · Updated last year
GeWu-Lab / CSOL_TPAMI2021
The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.
☆29 · Mar 4, 2022 · Updated 4 years ago
cdjkim / audiocaps
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
☆215 · Oct 6, 2025 · Updated 9 months ago
W-Wu / DEER
☆12 · Aug 25, 2023 · Updated 2 years ago
aoifemcdonagh / audioset-processing
Toolkit for downloading and processing Google's AudioSet dataset.
☆180 · Aug 22, 2025 · Updated 11 months ago
PeihaoChen / regnet
Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S…
☆53 · Dec 15, 2020 · Updated 5 years ago
yzxing87 / Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
☆155 · Jul 6, 2024 · Updated 2 years ago
ilaria-manco / muscaps
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
☆85 · Dec 3, 2024 · Updated last year
LAION-AI / audio-dataset
Audio Dataset for training CLAP and other models
☆748 · Jan 8, 2026 · Updated 6 months ago
CarlWangChina / REMAST-Real-time-Emotion-based-Music-Arrangement-with-Soft-Transition
SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music.
☆11 · Nov 15, 2025 · Updated 8 months ago
lijuncheng16 / AudioTaggingDoneRight
Experiments on AudioSet
☆43 · Jul 22, 2023 · Updated 3 years ago
RetroCirce / HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
☆504 · Sep 18, 2025 · Updated 10 months ago
marl / jams-data
Datasets and parsing scripts for JAMS
☆27 · Feb 1, 2020 · Updated 6 years ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137 · Aug 12, 2022 · Updated 3 years ago
aromanusc / SoundQ
Enhanced sound event localization and detection in real 360-degree audio-visual soundscapes (DCASE task3 format)
☆14 · Mar 21, 2025 · Updated last year
IvanBirkmaier / Audioset
This repository is built with a focus on practical ways to obtain and work with the audio data of AudioSet. You can use this repository t…
☆17 · Jun 12, 2025 · Updated last year
AccentDB / code
Code for AccentDB.
☆24 · May 28, 2021 · Updated 5 years ago