Dataset and baseline for the first Audiocaption task
☆79 · Jul 25, 2024 · Updated last year
Alternatives and similar repositories for AudioCaption
Users who are interested in AudioCaption are comparing it to the libraries listed below
- An audio classification system for learning with out-of-distribution data ☆33 · Dec 8, 2022 · Updated 3 years ago
- ☆26 · Apr 21, 2021 · Updated 4 years ago
- Python code for handling the Clotho dataset. ☆85 · Nov 24, 2020 · Updated 5 years ago
- Tools for the evaluation of audio captioning. ☆18 · May 23, 2020 · Updated 5 years ago
- ☆17 · Jun 30, 2020 · Updated 5 years ago
- A list of resources that can help in research for automated audio captioning ☆34 · Feb 17, 2021 · Updated 5 years ago
- A list of papers about audio captioning ☆79 · Jul 1, 2022 · Updated 3 years ago
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- ☆55 · Jul 6, 2023 · Updated 2 years ago
- 2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning… ☆24 · Aug 3, 2023 · Updated 2 years ago
- Interspeech 2019 tutorial materials ☆49 · Sep 26, 2019 · Updated 6 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 · Dec 3, 2024 · Updated last year
- ICASSP 2020 ESPnet-TTS: Merlin baseline system ☆36 · Oct 28, 2019 · Updated 6 years ago
- Mel-Generalized Cepstrum analysis ☆20 · Jul 21, 2017 · Updated 8 years ago
- ☆12 · Jun 2, 2019 · Updated 6 years ago
- BurrMill core ☆22 · Nov 2, 2021 · Updated 4 years ago
- Audio captioning baseline system for DCASE 2020 challenge. ☆38 · Aug 22, 2023 · Updated 2 years ago
- WaveNet implementation using tf.estimator ☆21 · Jul 6, 2023 · Updated 2 years ago
- Code for CVSSP submission to DCASE 2021 Task 6 ☆36 · Nov 22, 2022 · Updated 3 years ago
- A PyTorch implementation of WaveVAE ("Parallel Neural Text-to-Speech") ☆126 · Feb 24, 2024 · Updated 2 years ago
- RawNet: Fast End-to-End Neural Vocoder ☆42 · May 29, 2019 · Updated 6 years ago
- Based on https://github.com/fatchord/WaveRNN ☆24 · May 3, 2020 · Updated 5 years ago
- Python wrapper for Sinsy ☆53 · Oct 9, 2023 · Updated 2 years ago
- Baseline Kaldi script for UA-SPEECH corpus ☆32 · Oct 16, 2024 · Updated last year
- A PyTorch implementation of "Denoising Auto-encoder with Recurrent Skip Connections and Residual Regression for Music Source Separation" ☆13 · Jul 3, 2019 · Updated 6 years ago
- Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021. ☆93 · Dec 22, 2022 · Updated 3 years ago
- Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition. ☆36 · Oct 4, 2019 · Updated 6 years ago
- A test bed for updates and new features | pytorch/audio ☆171 · May 17, 2020 · Updated 5 years ago
- PyTorch Dataset for Speech and Music audio ☆80 · Jul 12, 2024 · Updated last year
- Code for ICASSP 2019 paper ☆18 · Oct 29, 2018 · Updated 7 years ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation ☆39 · Jul 16, 2020 · Updated 5 years ago
- Code to train and run Blow ☆145 · Sep 4, 2019 · Updated 6 years ago
- Interface for Controllable Expressive Talking Machine ☆40 · Sep 20, 2025 · Updated 5 months ago
- Fast spectrogram phase recovery using Local Weighted Sums (C/Python/Matlab) ☆117 · Nov 28, 2023 · Updated 2 years ago
- Vocode spectrograms to audio with generative adversarial networks ☆64 · Aug 8, 2019 · Updated 6 years ago
- Code for "CL4AC: A Contrastive Loss for Audio Captioning", DCASE Workshop 2021. ☆45 · Oct 8, 2021 · Updated 4 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly. ☆16 · Jun 17, 2022 · Updated 3 years ago
- A TensorFlow implementation of Griffin-Lim algorithm ☆79 · May 14, 2018 · Updated 7 years ago
- The Additive Margin MobileNet1D is a new lightweight deep learning model for Speaker Recognition which is based on the MobileNetV2 archi… ☆30 · Oct 3, 2023 · Updated 2 years ago