aoifemcdonagh/audioset-processing

Toolkit for downloading and processing Google's AudioSet dataset.

☆180

Alternatives and similar repositories for audioset-processing

Users who are interested in audioset-processing are comparing it to the libraries listed below.

MorenoLaQuatra / audioset-download
This package aims at simplifying the download of the AudioSet dataset.
☆60 · Jul 17, 2025 · Updated last year
jim-schwoebel / download_audioset
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
☆106 · Aug 1, 2023 · Updated 2 years ago
WangHelin1997 / SpecAugment-plus
A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"
☆34 · Jun 25, 2021 · Updated 5 years ago
fgnt / pb_sed
Paderborn Sound Event Detection
☆80 · Jul 18, 2023 · Updated 3 years ago
marmoi / dcase2021_task1a_baseline
☆14 · Jun 9, 2021 · Updated 5 years ago
YuanGongND / ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
☆478 · Apr 24, 2024 · Updated 2 years ago
Splend1d / T5lephone
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19 · Nov 29, 2022 · Updated 3 years ago
turpaultn / DESED
Repo associated to the DESED dataset, download and creation of data
☆154 · Jul 16, 2024 · Updated 2 years ago
facebookresearch / AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
☆673 · Apr 5, 2024 · Updated 2 years ago
fschmid56 / EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …
☆353 · Nov 20, 2024 · Updated last year
gudgud96 / frechet-audio-distance
A lightweight library for Frechet Audio Distance calculation.
☆317 · Feb 11, 2026 · Updated 5 months ago
JSALT-2022-SSL / superb-prosody
☆31 · Jul 13, 2023 · Updated 3 years ago
Audio-WestlakeU / ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
☆174 · Jun 8, 2026 · Updated last month
archinetai / audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
☆144 · Feb 11, 2023 · Updated 3 years ago
unixpickle / audioset
Fetch and use Google's AudioSet dataset
☆127 · Apr 13, 2017 · Updated 9 years ago
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
haoheliu / AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
☆304 · Dec 13, 2024 · Updated last year
archinetai / audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
☆129 · Jan 13, 2023 · Updated 3 years ago
LAION-AI / CLAP
Contrastive Language-Audio Pretraining
☆2,231 · May 15, 2025 · Updated last year
janson9192 / autokws2021
☆13 · Mar 25, 2021 · Updated 5 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
sharathadavanne / seld-dcase2021
Baseline method for sound event localization task of DCASE 2021 challenge
☆45 · Jun 15, 2021 · Updated 5 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
speedyseal / audiosetdl
Scripts for downloading AudioSet
☆89 · Nov 7, 2017 · Updated 8 years ago
YuanGongND / ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
☆428 · Aug 14, 2022 · Updated 3 years ago
audio-captioning / clotho-dataset
Python code for handling the Clotho dataset.
☆85 · Nov 24, 2020 · Updated 5 years ago
nomonosound / log-wmse-audio-quality
logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even whe…
☆39 · Jun 24, 2025 · Updated last year
YuanGongND / cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
☆292 · Mar 20, 2024 · Updated 2 years ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
qiuqiangkong / audioset_tagging_cnn
☆1,765 · Jul 25, 2024 · Updated 2 years ago
nervjack2 / Speech2Unit
☆13 · Sep 25, 2024 · Updated last year
maum-ai / univnet
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
☆286 · Oct 8, 2021 · Updated 4 years ago
ga642381 / FastSpeech2
Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech
☆99 · Oct 14, 2022 · Updated 3 years ago
zszheng147 / Spatial-AST
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆87 · Feb 13, 2025 · Updated last year
YuanGongND / psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
☆150 · Jul 13, 2023 · Updated 3 years ago
LAION-AI / audio-dataset
Audio Dataset for training CLAP and other models
☆748 · Jan 8, 2026 · Updated 6 months ago
asappresearch / slue-toolkit
A toolkit for Spoken Language Understanding Evaluation (SLUE) benchmark. Refer paper https://arxiv.org/abs/2111.10367 for more details. O…
☆65 · Feb 26, 2024 · Updated 2 years ago
aleXiehta / PhoneFortifiedPerceptualLoss
Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement
☆82 · Jun 28, 2021 · Updated 5 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago