zhaoyanpeng / audioset-dl
Download AudioSet for Vision-Audio-Text Pre-training
☆12 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for audioset-dl
- VIsually-Pivoted Audio and(N) Text ☆21 · Updated 2 years ago
- Download and create a tfreader for the audioset dataset ☆16 · Updated 4 years ago
- A list of resources that can help in research for automated audio captioning ☆34 · Updated 3 years ago
- Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric L… ☆53 · Updated last year
- The Pytorch implementation of paper: Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training ☆39 · Updated last year
- Simple baseline model for the HEAR benchmark ☆23 · Updated 3 weeks ago
- Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation ☆14 · Updated 4 years ago
- Conditioned U-Net for Music Source Separation ☆20 · Updated 3 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 · Updated last year
- The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem". ☆12 · Updated 3 years ago
- System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection ☆27 · Updated 2 years ago
- Zero-shot Learning for Audio-based Music Classification and Tagging (ISMIR 2019) ☆40 · Updated 5 years ago
- Emotion detection in audio utilising self-supervised representations trained with Contrastive Predictive Coding (CPC). ☆42 · Updated 2 years ago
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation ☆72 · Updated 3 years ago
- Dataset and baseline for the first Audiocaption task ☆79 · Updated 3 months ago
- CNN-based singing voice detection experiments ☆35 · Updated 6 years ago
- A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation" ☆33 · Updated 2 years ago
- Constrained Permutation Invariant Training, Speech Separation ☆43 · Updated 3 years ago
- Baseline systems for the FSD50K dataset ☆67 · Updated 3 years ago
- Pytorch code for the paper 'Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acousti… ☆14 · Updated 4 years ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation ☆38 · Updated 4 years ago