MorenoLaQuatra/audioset-download

MorenoLaQuatra / audioset-download

This package aims at simplifying the download of the AudioSet dataset.

☆60

Alternatives and similar repositories for audioset-download

Users who are interested in audioset-download are comparing it to the libraries listed below.

dlrudco / Fast-Audioset-Download
Download audioset data super fastly with youtube-dl, ffmpeg and python multiprocessing
☆48 · Aug 1, 2024 · Updated last year
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆35 · Dec 1, 2023 · Updated 2 years ago
aoifemcdonagh / audioset-processing
Toolkit for downloading and processing Google's AudioSet dataset.
☆180 · Aug 22, 2025 · Updated 11 months ago
swagshaw / WildDESED
WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection
☆18 · Nov 19, 2024 · Updated last year
MorenoLaQuatra / vad
Simple voice activity detection (VAD) algorithm in Python
☆15 · Aug 10, 2023 · Updated 2 years ago
PeiwenSun2000 / Both-Ears-Wide-Open
The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
☆65 · Jul 2, 2025 · Updated last year
chenjianyi / fastsag
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
☆30 · Dec 19, 2024 · Updated last year
kaistmm / VoiceDiT
[ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
☆52 · Apr 9, 2025 · Updated last year
amazon-science / contextual-attention-nlm
Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.
☆14 · Jul 25, 2023 · Updated 3 years ago
robd003 / sph2pipe
provide SPHERE-formatted output as well as RIFF, AU, AIFF and raw
☆14 · Dec 18, 2021 · Updated 4 years ago
zelaki / DreamSound
[ICASSP'24] Investigating Personalization Methods in Text to Music Generation
☆47 · Mar 27, 2024 · Updated 2 years ago
haoheliu / AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
☆304 · Dec 13, 2024 · Updated last year
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
slp-rl / SpokenStoryCloze
A spoken version of the textual story cloze benchmark
☆22 · Aug 6, 2023 · Updated 2 years ago
MorenoLaQuatra / bart-it
Pre-training BART model for the Italian Language
☆16 · Dec 28, 2022 · Updated 3 years ago
kuan2jiu99 / audio-hallucination
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34 · Mar 14, 2025 · Updated last year
MuSAELab / AUDDT
A toolkit for benchmarking on a wide variety of audio deepfake datasets.
☆36 · May 22, 2026 · Updated 2 months ago
SarthakYadav / fsd50k-pytorch
Unofficial implementation of FSD50k baselines for Sound Event Recognition
☆27 · Apr 27, 2024 · Updated 2 years ago
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
naba89 / iSeparate-SDX
iSeparate library for the SDX2023 challenge
☆15 · Dec 15, 2023 · Updated 2 years ago
facebookresearch / learning-audio-visual-dereverberation
Code for paper Learning Audio-Visual Dereverberation
☆32 · Aug 10, 2022 · Updated 3 years ago
ksasso1028 / audio-reverb-removal
Code to train a custom time-domain autoencoder to dereverb audio
☆16 · Nov 30, 2023 · Updated 2 years ago
davidliujiafeng / ccom_mdx2023
☆10 · Jun 6, 2023 · Updated 3 years ago
zszheng147 / Spatial-AST
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆87 · Feb 13, 2025 · Updated last year
Alittleegg / Eureka-Audio
Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…
☆40 · Apr 11, 2026 · Updated 3 months ago
ImperialCollegeLondon / spear-tools
SPEAR Challenge scripts and tools.
☆25 · Mar 17, 2023 · Updated 3 years ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 · Jul 23, 2024 · Updated 2 years ago
MorenoLaQuatra / ARCH
ARCH: Audio Representations benCHmark
☆57 · Aug 26, 2024 · Updated last year
slSeanWU / beats-conformer-bart-audio-captioner
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41 · Jan 6, 2024 · Updated 2 years ago
mzsun01 / MM-LDM
☆11 · Apr 12, 2024 · Updated 2 years ago
SarthakYadav / audiomae-plusplus-official
Official repository for the paper "AudioMAE++: learning better masked audio representations with SwiGLU FFNs"
☆15 · Apr 30, 2026 · Updated 3 months ago
zeyuxie29 / AudioTime
☆39 · Jul 4, 2024 · Updated 2 years ago
merlresearch / reverberation-as-supervision
Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation
☆15 · Aug 1, 2024 · Updated last year
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
RetroCirce / Zero_Shot_Audio_Source_Separation
The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022
☆213 · Jul 14, 2022 · Updated 4 years ago
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32 · Jan 26, 2024 · Updated 2 years ago
AlekseyKorshuk / accompaniment-generator
Generate accompaniment part with chords using Evolutionary algorithm.
☆11 · May 8, 2022 · Updated 4 years ago
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆20 · Mar 17, 2025 · Updated last year
Jerry-jwz / Audio-Enhancement-via-ONMF
☆23 · Feb 2, 2022 · Updated 4 years ago