minguinho26/Prefix_AAC_ICASSP2023

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/minguinho26/Prefix_AAC_ICASSP2023)

minguinho26 / Prefix_AAC_ICASSP2023

Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"

☆30

Alternatives and similar repositories for Prefix_AAC_ICASSP2023

Users that are interested in Prefix_AAC_ICASSP2023 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jaeho-lee / oce
View on GitHub
Codes for "Learning bounds for risk-sensitive learning," NeurIPS 2020 (or see arXiv 2006.08138)
☆11Oct 15, 2020Updated 5 years ago
effl-lab / MaskedKD
View on GitHub
Official Implementation of "The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers (ECCV 2024)”
☆26Jan 15, 2025Updated last year
jaeho-lee / MetaSparseINR
View on GitHub
Meta-Learning Sparse Implicit Neural Representations (NeurIPS 2021)
☆64Oct 29, 2021Updated 4 years ago
effl-lab / Fast-Neural-Fields
View on GitHub
Research Papers on Efficient Neural Fields from EffL Group
☆16Apr 21, 2025Updated last year
HJ-Ok / AudioBERT
View on GitHub
AudioBERT 📢 : Audio Knowledge Augmented Language Model (ICASSP 2025)
☆40Feb 1, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Labbeti / conette-audio-captioning
View on GitHub
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
☆23Dec 17, 2025Updated 7 months ago
XinhaoMei / ACT
View on GitHub
Source code for the paper 'Audio Captioning Transformer'
☆56Jan 18, 2022Updated 4 years ago
microsoft / NoAudioCaptioning
View on GitHub
Repository for "Training Audio Captioning Models without Audio"
☆10Sep 26, 2023Updated 2 years ago
Sreyan88 / RECAP
View on GitHub
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16Jun 23, 2024Updated 2 years ago
bagustris / s3prl-ser
View on GitHub
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15Feb 28, 2026Updated 4 months ago
effl-lab / TACO
View on GitHub
Official Implementation of "Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity (ICML 2024)"
☆44Aug 28, 2024Updated last year
microsoft / AudioEntailment
View on GitHub
Audio Entailment: Deductive Reasoning for Audio Understanding
☆17Dec 10, 2024Updated last year
Labbeti / aac-datasets
View on GitHub
Audio Captioning datasets for PyTorch.
☆129Mar 25, 2026Updated 4 months ago
taegyeong-lee / Generating-Realistic-Images-from-In-the-wild-Sounds
View on GitHub
Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023
☆12Aug 24, 2025Updated 11 months ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
lavendery / AudioComposer
View on GitHub
☆27Sep 10, 2025Updated 10 months ago
audio-captioning / audio-captioning-papers
View on GitHub
A list of papers about audio captioning
☆78Jul 1, 2022Updated 4 years ago
frankenliu / LOAE
View on GitHub
☆10Sep 25, 2024Updated last year
v-manhlt3 / m-LTM-Audio-Text-Retrieval
View on GitHub
☆13Jan 5, 2025Updated last year
jaeyeonkim99 / EnCLAP
View on GitHub
Official Implementation of EnCLAP (ICASSP 2024)
☆96Jun 2, 2024Updated 2 years ago
NKU-HLT / KNN-CTC
View on GitHub
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆42Mar 20, 2024Updated 2 years ago
topel / audioset-convnext-inf
View on GitHub
Adapting a ConvNeXt model to audio classification on AudioSet
☆27Feb 19, 2025Updated last year
wsntxxn / TextToAudioGrounding
View on GitHub
The dataset and baseline code for Text-to-Audio Grounding (TAG)
☆49Oct 23, 2025Updated 9 months ago
xuchennlp / S2T
View on GitHub
The project for speech translation
☆12Sep 28, 2023Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
JongSuk1 / AVCap
View on GitHub
☆11Sep 1, 2024Updated last year
cdjkim / audiocaps
View on GitHub
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
☆215Oct 6, 2025Updated 9 months ago
chorowski-lab / CPC_audio
View on GitHub
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆10Feb 22, 2022Updated 4 years ago
ahsanMah / msma
View on GitHub
Multiscale Score Matching Analysis
☆11Jan 19, 2023Updated 3 years ago
Sreyan88 / LAPE
View on GitHub
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)
☆29Jul 9, 2024Updated 2 years ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
View on GitHub
☆13Mar 23, 2026Updated 4 months ago
audio-captioning / dcase-2020-baseline
View on GitHub
Audio captioning baseline system for DCASE 2020 challenge.
☆38Aug 22, 2023Updated 2 years ago
XinhaoMei / audio-text_retrieval
View on GitHub
Implementation of our paper 'On Metric Learning For Audio-Text Cross-Modal Retrieval'
☆51May 17, 2022Updated 4 years ago
Ming-er / MGA-CLAP
View on GitHub
official implementation of MGA-CLAP (ACM MM 2024)
☆29Oct 25, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
kuan2jiu99 / audio-hallucination
View on GitHub
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34Mar 14, 2025Updated last year
xiaoxue1117 / speech-mamba-public
View on GitHub
☆15Nov 26, 2024Updated last year
Labbeti / aac-metrics
View on GitHub
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
☆75Mar 22, 2026Updated 4 months ago
zeyuxie29 / AudioTime
View on GitHub
☆39Jul 4, 2024Updated 2 years ago
shaokai1209 / MDSA
View on GitHub
[IEEE, TASLP, 2023] The code of the paper "Multi-Source Discriminant Subspace Alignment for Cross-Domain Speech Emotion Recognition".
☆19Sep 27, 2024Updated last year
felixgontier / dcase-2023-baseline
View on GitHub
☆14Mar 25, 2023Updated 3 years ago
Xianchao-Wu / wenet-deep-sparse-conformer
View on GitHub
☆15Aug 25, 2022Updated 3 years ago