audio-captioning/clotho-dataset

Python code for handling the Clotho dataset.

☆85

Alternatives and similar repositories for clotho-dataset

Users that are interested in clotho-dataset are comparing it to the libraries listed below.

audio-captioning / caption-evaluation-tools
Tools for the evaluation of audio captioning.
☆19 · May 23, 2020 · Updated 6 years ago
etzinis / biased_separation
Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation
☆14 · Nov 16, 2020 · Updated 5 years ago
cdjkim / audiocaps
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
☆215 · Oct 6, 2025 · Updated 9 months ago
magronp / phase-madtwinnet
Code for phase recovery in MadTwinNet for monaural singing voice separation
☆12 · Jul 17, 2018 · Updated 8 years ago
RicherMans / AudioCaption
Dataset and baseline for the first Audiocaption task
☆79 · Jul 25, 2024 · Updated last year
XinhaoMei / ACT
Source code for the paper 'Audio Captioning Transformer'
☆56 · Jan 18, 2022 · Updated 4 years ago
audio-captioning / dcase-2020-baseline
Audio captioning baseline system for DCASE 2020 challenge.
☆38 · Aug 22, 2023 · Updated 2 years ago
XinhaoMei / WavCaps
This repository contains metadata of the WavCaps dataset and code for downstream tasks.
☆264 · Jul 25, 2024 · Updated last year
CVSSP / perceptual-study-source-separation
Repository for subjective and objective evaluation of source separation algorithms
☆12 · Apr 18, 2018 · Updated 8 years ago
audio-captioning / audio-captioning-papers
A list of papers about audio captioning
☆78 · Jul 1, 2022 · Updated 4 years ago
wsntxxn / AudioCaption
Audio captioning recipe
☆53 · Oct 23, 2025 · Updated 9 months ago
lukewys / dcase_2020_T6
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning…
☆24 · Aug 3, 2023 · Updated 2 years ago
dr-costas / undaw
Unsupervised Domain Adaptation for Acoustic Scene Classification with Wasserstein Distance
☆14 · Sep 16, 2020 · Updated 5 years ago
audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago
andrebola / contrastive-mir-learning
This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"
☆15 · Jun 22, 2023 · Updated 3 years ago
LucasRr / Dictionary_learning_for_declipping_Python
Consistent dictionary learning algorithm for signal declipping (Python code)
☆20 · Oct 24, 2018 · Updated 7 years ago
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
PanagiotisP / svs-multiband
Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION", accepted at EUSIPCO 2022
☆15 · Jun 18, 2022 · Updated 4 years ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
Audio-AGI / dcase2024_task9_baseline
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26 · Mar 27, 2024 · Updated 2 years ago
felixgontier / dcase-2023-baseline
☆14 · Mar 25, 2023 · Updated 3 years ago
artie-inc / artie-bias-corpus
Artie Bias Corpus: an audio corpus + code for detecting demographic bias
☆20 · Jul 21, 2020 · Updated 6 years ago
MTG / Podcastmix
PodcastMix: a dataset for separating music and speech in podcasts.
☆44 · Aug 20, 2024 · Updated last year
vadimkantorov / readaudio
Read audio with FFmpeg into NumPy/PyTorch via ctypes (standard library module)
☆11 · Aug 12, 2020 · Updated 5 years ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
ffaisal93 / SD-QA
☆16 · Feb 10, 2026 · Updated 5 months ago
hainan-xv / PASM
Pronunciation-assisted Subword Modeling
☆31 · May 30, 2019 · Updated 7 years ago
akoepke / audio-retrieval-benchmark
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54 · Jul 16, 2025 · Updated last year
falabrasil / ufpalign
👄🇧🇷 Forced phonetic alignment in Brazilian Portuguese
☆13 · Jul 18, 2025 · Updated last year
xieh97 / dcase2023-audio-retrieval
Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge
☆10 · Aug 8, 2023 · Updated 2 years ago
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆35 · Dec 1, 2023 · Updated 2 years ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year
Labbeti / aac-datasets
Audio Captioning datasets for PyTorch.
☆129 · Mar 25, 2026 · Updated 3 months ago
jcvasquezc / phonet
Keras-based python framework to compute phonological posterior probabilities from audio files
☆48 · Dec 27, 2022 · Updated 3 years ago
tqbl / ood_audio
An audio classification system for learning with out-of-distribution data
☆33 · Dec 8, 2022 · Updated 3 years ago
dr-costas / dnd-sed
Sound event detection with depthwise separable and dilated convolutions.
☆53 · Mar 30, 2020 · Updated 6 years ago
Chung-I / youtube-asr-crawler
☆10 · Sep 19, 2022 · Updated 3 years ago