zhaoyanpeng / audioset-dl
Download AudioSet for Vision-Audio-Text Pre-training
☆12 Updated 2 years ago
Alternatives and similar repositories for audioset-dl:
Users who are interested in audioset-dl are comparing it to the repositories listed below
- VIsually-Pivoted Audio and(N) Text ☆22 Updated 2 years ago
- Language modelling for sound event detection ☆21 Updated 5 years ago
- Benchmarking different VAD models on AVA-Speech dataset ☆11 Updated last year
- Audio captioning baseline system for DCASE 2020 challenge. ☆38 Updated last year
- ☆32 Updated 4 years ago
- Simple baseline model for the HEAR benchmark ☆23 Updated last week
- Download and create a tfreader for the audioset dataset ☆16 Updated 4 years ago
- A list of resources that can help in research for automated audio captioning ☆34 Updated 3 years ago
- ☆30 Updated 2 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 Updated last year
- System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection ☆28 Updated 2 years ago
- ☆47 Updated 2 years ago
- Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation ☆14 Updated 4 years ago
- COLA contrastive pre-training method implemented in PyTorch ☆42 Updated 3 years ago
- JAMS annotation files for the original and augmented UrbanSound8K dataset ☆35 Updated 6 years ago
- A new metric for evaluating end-to-end speech recognition and disfluency removal systems ☆19 Updated 3 years ago
- Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric L… ☆53 Updated last year
- ☆14 Updated last year
- Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, a… ☆39 Updated 3 years ago
- Data processing tools for preparing speech and labels for training TTS voices ☆24 Updated 4 years ago
- A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation" ☆33 Updated 2 years ago
- Paderbox: A collection of utilities for audio / speech processing ☆38 Updated 7 months ago
- Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021 ☆38 Updated 3 years ago
- Sound event detection with depthwise separable and dilated convolutions. ☆54 Updated 4 years ago
- Constrained Permutation Invariant Training, Speech Separation ☆44 Updated 3 years ago
- Repository for Weak Label Learning for Audio Events - A closer look. Uses Audioset subset data provided for reproducibility. ☆32 Updated last year
- implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain ☆44 Updated 4 years ago
- ☆58 Updated 4 years ago
- A list of papers about audio captioning ☆78 Updated 2 years ago
- Conditioned U-Net for Music Source Separation ☆20 Updated 3 years ago