Dataset and baseline for the first Audiocaption task
☆79Jul 25, 2024Updated last year
Alternatives and similar repositories for AudioCaption
Users that are interested in AudioCaption are comparing it to the libraries listed below.
- A list of resources that can help in research for automated audio captioning☆34Feb 17, 2021Updated 5 years ago
- 2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning…☆24Aug 3, 2023Updated 2 years ago
- An audio classification system for learning with out-of-distribution data☆33Dec 8, 2022Updated 3 years ago
- A list of papers about audio captioning☆79Jul 1, 2022Updated 3 years ago
- Tools for the evaluation of audio captioning.☆19May 23, 2020Updated 5 years ago
- Python code for handling the Clotho dataset.☆85Nov 24, 2020Updated 5 years ago
- ☆12Jun 2, 2019Updated 6 years ago
- Code for CVSSP submission to DCASE 2021 Task 6☆36Nov 22, 2022Updated 3 years ago
- ☆26Apr 21, 2021Updated 4 years ago
- Audio captioning baseline system for DCASE 2020 challenge.☆38Aug 22, 2023Updated 2 years ago
- A Pytorch implementation of WaveVAE ("Parallel Neural Text-to-Speech")☆126Feb 24, 2024Updated 2 years ago
- Audio captioning recipe☆51Oct 23, 2025Updated 5 months ago
- Materials of public talks given by SJTU X-LANCE members☆14Dec 3, 2022Updated 3 years ago
- PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.☆13Feb 6, 2021Updated 5 years ago
- ☆55Jul 6, 2023Updated 2 years ago
- Code for "CL4AC: A Contrastive Loss for Audio Captioning", DCASE Workshop 2021.☆45Oct 8, 2021Updated 4 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- BurrMill core☆22Nov 2, 2021Updated 4 years ago
- Code to train and run Blow☆145Sep 4, 2019Updated 6 years ago
- Interspeech 2019 tutorial materials☆49Sep 26, 2019Updated 6 years ago
- Mel-Generalized Cepstrum analysis☆20Jul 21, 2017Updated 8 years ago
- Based on https://github.com/fatchord/WaveRNN☆24May 3, 2020Updated 5 years ago
- This repository contains metadata of the WavCaps dataset and code for downstream tasks.☆257Jul 25, 2024Updated last year
- ☆14Apr 18, 2019Updated 6 years ago
- ☆17Feb 14, 2020Updated 6 years ago
- RawNet: Fast End-to-End Neural Vocoder☆42May 29, 2019Updated 6 years ago
- WaveNet implementation using tf.estimator☆21Jul 6, 2023Updated 2 years ago
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"☆50Nov 10, 2022Updated 3 years ago
- Baseline kaldi script for UA-SPEECH corpus☆32Oct 16, 2024Updated last year
- Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.☆93Dec 22, 2022Updated 3 years ago
- Unsupervised speech activity detection system.☆11Jul 2, 2018Updated 7 years ago
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding☆22Dec 17, 2025Updated 3 months ago
- Experiment with JNI access to some Kaldi functions.☆12Dec 31, 2018Updated 7 years ago
- Convolutional neural networks for sound classification☆20Dec 30, 2017Updated 8 years ago
- FFTNet: a Real-Time Speaker-Dependent Neural Vocoder☆64Aug 7, 2018Updated 7 years ago
- A test bed for updates and new features | pytorch/audio☆171May 17, 2020Updated 5 years ago
- ☆17Jun 30, 2020Updated 5 years ago
- Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling☆169May 14, 2022Updated 3 years ago
- PyTorch Dataset for Speech and Music audio☆80Jul 12, 2024Updated last year