Python code for handling the Clotho dataset.
☆85Nov 24, 2020Updated 5 years ago
Alternatives and similar repositories for clotho-dataset
Users who are interested in clotho-dataset are comparing it to the libraries listed below
- Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation☆14Nov 16, 2020Updated 5 years ago
- Tools for the evaluation of audio captioning.☆18May 23, 2020Updated 5 years ago
- Dataset and baseline for the first Audiocaption task☆79Jul 25, 2024Updated last year
- 🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps☆203Oct 6, 2025Updated 4 months ago
- Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION", accepted at EUSIPCO 2022☆15Jun 18, 2022Updated 3 years ago
- Consistent dictionary learning algorithm for signal declipping (Python code)☆20Oct 24, 2018Updated 7 years ago
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias☆20Jul 21, 2020Updated 5 years ago
- ☆14Jun 12, 2015Updated 10 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Jun 2, 2023Updated 2 years ago
- This repository contains metadata for the WavCaps dataset and code for downstream tasks.☆257Jul 25, 2024Updated last year
- A fork of Idiap Research Institute's DiarTk diarization toolkit☆16Feb 20, 2016Updated 10 years ago
- PodcastMix: a dataset for separating music and speech in podcasts.☆44Aug 20, 2024Updated last year
- Audio captioning baseline system for DCASE 2020 challenge.☆38Aug 22, 2023Updated 2 years ago
- Pronunciation-assisted Subword Modeling☆31May 30, 2019Updated 6 years ago
- Audio captioning recipe☆51Oct 23, 2025Updated 4 months ago
- Unsupervised speech activity detection system.☆11Jul 2, 2018Updated 7 years ago
- Keras-based Python framework to compute phonological posterior probabilities from audio files☆46Dec 27, 2022Updated 3 years ago
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- Source code for the paper 'Audio Captioning Transformer'☆57Jan 18, 2022Updated 4 years ago
- A list of papers about audio captioning☆79Jul 1, 2022Updated 3 years ago
- A handy dataset of noises for ASR☆22May 29, 2019Updated 6 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".☆54Jul 16, 2025Updated 7 months ago
- Read audio with FFmpeg into NumPy/PyTorch via ctypes (standard library module)☆11Aug 12, 2020Updated 5 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- An extension of the Kaldi speech recognition software that allows decoding speech with hybrid word and phoneme graphs.…☆11Feb 4, 2020Updated 6 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- Repository for subjective and objective evaluation of source separation algorithms☆12Apr 18, 2018Updated 7 years ago
- Score Normalization for NIST 2019 Speaker Recognition Evaluation☆10Nov 8, 2019Updated 6 years ago
- Code for phase recovery in MadTwinNet for monaural singing voice separation☆12Jul 17, 2018Updated 7 years ago
- 👄🇧🇷 Forced phonetic alignment in Brazilian Portuguese☆12Jul 18, 2025Updated 7 months ago
- VGGSound: A Large-scale Audio-Visual Dataset☆351Sep 13, 2021Updated 4 years ago
- Streaming source separation for music and speech files, using the Open-Unmix LSTM architecture.☆21Dec 8, 2022Updated 3 years ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"☆26Mar 27, 2024Updated last year
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 9 months ago
- ☆14Apr 18, 2019Updated 6 years ago
- PyTorch implementation of the NSGT/sliCQT☆17Nov 10, 2023Updated 2 years ago
- Web page for ISCA Special Interest Group: Robust Speech Processing (RoSP)☆11Dec 4, 2023Updated 2 years ago
- An audio classification system for learning with out-of-distribution data☆33Dec 8, 2022Updated 3 years ago