Tools for the evaluation of audio captioning.
☆19 May 23, 2020 Updated 5 years ago
Alternatives and similar repositories for caption-evaluation-tools
Users who are interested in caption-evaluation-tools are comparing it to the libraries listed below.
- Code for CVSSP submission to DCASE 2021 Task 6 ☆36 Nov 22, 2022 Updated 3 years ago
- ☆14 Mar 25, 2023 Updated 2 years ago
- Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch. ☆69 Jul 19, 2025 Updated 8 months ago
- Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge ☆10 Aug 8, 2023 Updated 2 years ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems ☆13 Jan 16, 2025 Updated last year
- A list of resources that can help in research for automated audio captioning ☆34 Feb 17, 2021 Updated 5 years ago
- Python code for handling the Clotho dataset. ☆85 Nov 24, 2020 Updated 5 years ago
- ☆51 Apr 13, 2025 Updated 11 months ago
- Dataset and baseline for the first Audiocaption task ☆79 Jul 25, 2024 Updated last year
- Explaining audio differences using language ☆16 Feb 11, 2025 Updated last year
- ☆16 Aug 10, 2025 Updated 7 months ago
- ☆19 Jul 22, 2025 Updated 7 months ago
- small audio language model for reasoning ☆86 Dec 4, 2025 Updated 3 months ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆33 Feb 11, 2026 Updated last month
- The code and weight for LoVA. LoVA is a novel model for Long-form Video-to-Audio generation. Based on the Diffusion Transformer (DiT) arc… ☆15 Feb 27, 2025 Updated last year
- Code for phase recovery in MadTwinNet for monaural singing voice separation ☆12 Jul 17, 2018 Updated 7 years ago
- Fine-tune Stable Audio Open with DiT ControlNet. ☆249 May 16, 2025 Updated 10 months ago
- The dataset and baseline code for Text-to-Audio Grounding (TAG) ☆50 Oct 23, 2025 Updated 4 months ago
- ☆12 Nov 7, 2024 Updated last year
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆49 Jan 19, 2026 Updated 2 months ago
- Spectral RNNs with adaptive window learning in TensorFlow, ICANN 2020. ☆10 Sep 20, 2021 Updated 4 years ago
- text to speech ☆10 Mar 19, 2024 Updated 2 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning" ☆15 Jun 22, 2023 Updated 2 years ago
- Audio captioning baseline system for DCASE 2020 challenge. ☆38 Aug 22, 2023 Updated 2 years ago
- [ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models" ☆20 Mar 8, 2026 Updated last week
- Sequence alignment methods with helpers for PyTorch. ☆24 Nov 30, 2022 Updated 3 years ago
- Unsupervised Domain Adaptation for Acoustic Scene Classification with Wasserstein Distance ☆14 Sep 16, 2020 Updated 5 years ago
- ☆16 Sep 29, 2025 Updated 5 months ago
- Code repository for GCT634 Musical Applications of Machine Learning (Spring 2024) ☆11 May 19, 2024 Updated last year
- Automatically Generated d2l-zh TensorFlow Notebooks for Colab ☆13 Aug 18, 2023 Updated 2 years ago
- This package aims at simplifying the download of the AudioCaps dataset. ☆36 Dec 1, 2023 Updated 2 years ago
- "Enemy Spotted: In-game Gun Sound Dataset for Gunshot Classification and Localization", accepted at IEEE Conference on Games (CoG) 2022 ☆21 Sep 6, 2024 Updated last year
- A new metric for evaluating end-to-end speech recognition and disfluency removal systems ☆19 Mar 7, 2021 Updated 5 years ago
- Onset-and-Offset-Aware Sound Event Detection ☆21 Feb 10, 2025 Updated last year
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models ☆200 May 29, 2024 Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆13 Dec 4, 2024 Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 Mar 11, 2025 Updated last year
- ☆15 Nov 11, 2024 Updated last year
- Configuration Space Exploration Framework ☆17 Oct 13, 2020 Updated 5 years ago