Labbeti/conette-audio-captioning

CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding

☆23

Alternatives and similar repositories for conette-audio-captioning

Users who are interested in conette-audio-captioning are comparing it to the libraries listed below.

topel / audioset-convnext-inf
Adapting a ConvNeXt model to audio classification on AudioSet
☆27 · Feb 19, 2025 · Updated last year
blmoistawinde / fense
Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval…
☆21 · Feb 1, 2023 · Updated 3 years ago
minguinho26 / Prefix_AAC_ICASSP2023
Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"
☆30 · Dec 6, 2023 · Updated 2 years ago
jaeyeonkim99 / EnCLAP
Official Implementation of EnCLAP (ICASSP 2024)
☆96 · Jun 2, 2024 · Updated 2 years ago
bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 · Feb 28, 2026 · Updated 4 months ago
Labbeti / aac-metrics
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
☆75 · Mar 22, 2026 · Updated 4 months ago
lukewys / dcase_2020_T6
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning…
☆24 · Aug 3, 2023 · Updated 2 years ago
audio-captioning / audio-captioning-papers
A list of papers about audio captioning
☆78 · Jul 1, 2022 · Updated 4 years ago
audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago
wsntxxn / AudioCaption
Audio captioning recipe
☆53 · Oct 23, 2025 · Updated 9 months ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
☆13 · Jan 5, 2025 · Updated last year
felixgontier / dcase-2023-baseline
☆14 · Mar 25, 2023 · Updated 3 years ago
Audio-AGI / dcase2024_task9_baseline
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26 · Mar 27, 2024 · Updated 2 years ago
Labbeti / aac-datasets
Audio Captioning datasets for PyTorch.
☆129 · Mar 25, 2026 · Updated 4 months ago
XinhaoMei / DCASE2021_task6_v2
Code for CVSSP submission to DCASE 2021 Task 6
☆36 · Nov 22, 2022 · Updated 3 years ago
tqbl / arca23k-dataset
The code used to create the ARCA23K and ARCA23K-FSD datasets
☆16 · Nov 9, 2021 · Updated 4 years ago
lijuncheng16 / AudioTaggingDoneRight
experiments about AudioSet
☆43 · Jul 22, 2023 · Updated 3 years ago
snap-research / AVLink
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
☆17 · Aug 3, 2025 · Updated 11 months ago
prompteus / audio-captioning
Audio captioning - DCASE challenge 2023 task 6a
☆30 · Dec 26, 2024 · Updated last year
snap-research / GenAU
☆53 · Mar 24, 2026 · Updated 4 months ago
chrschy / pilot
☆19 · Jun 10, 2021 · Updated 5 years ago
teo-sl / Audio-Super-Resolution-ViT
This repository contains the source code for the implementation of two deep learning models concerning the audio super resolution task.
☆14 · Mar 14, 2023 · Updated 3 years ago
MoayedHajiAli / VidStyleODE-official
☆18 · Jul 16, 2024 · Updated 2 years ago
hearbenchmark / hear-baseline
Simple baseline model for the HEAR benchmark
☆23 · Feb 17, 2026 · Updated 5 months ago
sharathadavanne / seld-dcase2020
Baseline method for sound event localization task of DCASE 2020 challenge
☆60 · Nov 20, 2020 · Updated 5 years ago
RicherMans / AudioCaption
Dataset and baseline for the first Audiocaption task
☆79 · Jul 25, 2024 · Updated 2 years ago
haoheliu / AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
☆304 · Dec 13, 2024 · Updated last year
frankenliu / LOAE
☆10 · Sep 25, 2024 · Updated last year
kuan2jiu99 / Awesome-Speech-Generation
Survey on speech generation work.
☆21 · Nov 26, 2023 · Updated 2 years ago
shaokai1209 / MDSA
[IEEE, TASLP, 2023] The code of the paper "Multi-Source Discriminant Subspace Alignment for Cross-Domain Speech Emotion Recognition".
☆19 · Sep 27, 2024 · Updated last year
9rum / flatflow
Fast and exact parallel training of neural networks
☆13 · Updated this week
wangyu / rethink-audio-fsl
Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)
☆43 · May 24, 2022 · Updated 4 years ago
XinhaoMei / WavCaps
This repository contains metadata of WavCaps dataset and code for downstream tasks.
☆264 · Jul 25, 2024 · Updated 2 years ago
Chutlhu / dEchorate
Da - ECHO - RetrievAl - daTasEt
☆36 · Jul 7, 2024 · Updated 2 years ago
qiuqiangkong / dcase2019_task1
☆20 · May 13, 2019 · Updated 7 years ago
jhcodec843 / jhcodec
☆48 · Updated this week
kyegomez / AudioFlamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…
☆39 · Jan 27, 2025 · Updated last year
NazirNayal8 / UEM-likelihood-ratio
Official Code for "A Likelihood Ratio-Based Approach to Segmenting Unknown Objects" [IJCV 2025]
☆15 · Jun 9, 2025 · Updated last year
habla-liaa / encodecmae
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
☆101 · Jul 24, 2024 · Updated 2 years ago