prompteus/audio-captioning

Audio captioning - DCASE challenge 2023 task 6a

☆30

Alternatives and similar repositories for audio-captioning

Users who are interested in audio-captioning are comparing it to the libraries listed below.

georgid / AlignmentEvaluation
Scripts for computing common lyrics-to-audio alignment evaluation metrics. Usable evaluation for any token-based alignment (e.g. if tok…
☆18 · Oct 27, 2020 · Updated 5 years ago
sigmedia / sp1ny
☆10 · Aug 29, 2024 · Updated last year
schufo / plla-tisvs
Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation
☆24 · Nov 8, 2021 · Updated 4 years ago
zerospeech / zerospeech2021
Zerospeech Challenge 2021: validation and evaluation software
☆12 · Jun 13, 2022 · Updated 4 years ago
GasserElbanna / serab-byols
(Hybrid) BYOL-S feature extractor using serab-byols package in pytorch.
☆27 · Apr 20, 2024 · Updated 2 years ago
rishabhjain16 / whisper_child_asr
☆12 · May 23, 2023 · Updated 3 years ago
jihoojung0106 / open-singsong
Open SingSong - Implementation of 'SingSong: Generating Musical Accompaniments from Singing' by Google Research, with a few modifications
☆17 · Jun 10, 2024 · Updated 2 years ago
Miffyli / asv-cm-reinforce
Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE
☆13 · Mar 31, 2021 · Updated 5 years ago
groadabike / Kaldi-Dsing-task
DSing ASR task: Resources and Baseline for an unaccompanied singing ASR.
☆19 · Jul 9, 2026 · Updated last week
palle-k / tsne-pytorch
CUDA-accelerated PyTorch implementation of t-SNE
☆25 · May 15, 2021 · Updated 5 years ago
chitralekha18 / AutomaticSungLyricsAnnotation_ISMIR2018
☆22 · Sep 26, 2022 · Updated 3 years ago
neverix / musicgen_trainer
simple trainer for musicgen/audiocraft
☆15 · Jul 14, 2023 · Updated 3 years ago
teo-sl / Audio-Super-Resolution-ViT
This repository contains the source code for the implementation of two deep learning models concerning the audio super resolution task.
☆14 · Mar 14, 2023 · Updated 3 years ago
BiSinger-SVS / BiSinger
Bilingual Singing Voice Synthesis
☆18 · Mar 25, 2024 · Updated 2 years ago
jhwanflow / Fastspeech2-Korean
☆14 · May 14, 2021 · Updated 5 years ago
iamxiaoyubei / Voice-Tech-Study
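A collection of resources on speech technologies: speech recognition, speech front-end processing, speech synthesis, voice conversion, and more
☆23 · Nov 8, 2019 · Updated 6 years ago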
gzhu06 / Y-vector
Y-vector: Multiscale Waveform Encoder for Speaker Embedding
☆24 · Jul 16, 2024 · Updated 2 years ago
koreanAI / 2023-Korean-AI-Competition
2023 Korean AI Competition
☆10 · Oct 30, 2023 · Updated 2 years ago
microsoft / Pengi
An Audio Language model for Audio Tasks
☆322 · Apr 19, 2024 · Updated 2 years ago
Labbeti / conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
☆23 · Dec 17, 2025 · Updated 7 months ago
lampts / chatgpt-mle-interview
ChatGPT solutions for the MLE interview
☆14 · Dec 9, 2022 · Updated 3 years ago
bryan051003 / USVG
A unified model for zero-shot singing voice conversion and synthesis
☆22 · Nov 30, 2022 · Updated 3 years ago
CODEJIN / DiffSingerKR
☆25 · Aug 31, 2024 · Updated last year
fschmid56 / EfficientAT_HEAR
Evaluate EfficientAT models on the Holistic Evaluation of Audio Representations Benchmark.
☆34 · Jun 23, 2023 · Updated 3 years ago
SamLynnEvans / LSTM_with_attention
Seq2seq using LSTM with attention from Luong et al
☆10 · Oct 2, 2018 · Updated 7 years ago
gianfelton / RFM-Segmentation-with-Quartiles-Jenks-Natural-Breaks-and-HDBSCAN
☆10 · Jul 12, 2019 · Updated 7 years ago
rasbt / datapipes-blog
Code for the DataPipes article
☆15 · Jun 14, 2022 · Updated 4 years ago
p1an-lin-jung / WavThruVec_pytorch
An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis"
☆29 · Sep 6, 2023 · Updated 2 years ago
cardoso / AutoPong
My WWDC17 scholarship winning playground
☆13 · Feb 14, 2019 · Updated 7 years ago
tangloner / ssmonetdb
sqlsmith driver for monetdb
☆10 · Mar 31, 2017 · Updated 9 years ago
lballore / streamlit-nginx
streamlit-nginx is a Docker image with Streamlit and Nginx for web-based demo applications in Python 3.6 and above, in a single container…
☆14 · Apr 28, 2021 · Updated 5 years ago
SungFeng-Huang / Meta-TTS
Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More up-to-date code is in "refactor" branch.
☆192 · Jun 8, 2023 · Updated 3 years ago
imdreamrunner / python-jyutping
A Python tool for converting Chinese characters to Jyutping.
☆35 · Feb 26, 2024 · Updated 2 years ago
sylvchev / riemannianPCA
Dimensionality reduction on manifold of SPD matrices, based on pymanopt implementation
☆11 · Apr 18, 2023 · Updated 3 years ago
vgbench / VGBench
☆19 · Sep 19, 2024 · Updated last year
facebookarchive / rf-coverage-maps
This project will provide code that reads geospatial RF coverage data stored in CSV format. It will parse out relevant fields (latitude,…
☆13 · Sep 17, 2021 · Updated 4 years ago
AmandineBtto / Batvision-Dataset
A large-scale real-world audio-visual dataset for research on 3D scene understanding and echolocation.
☆22 · Oct 21, 2025 · Updated 9 months ago
taichi-dev / advanced_examples
More advanced Taichi examples
☆13 · Jun 16, 2021 · Updated 5 years ago
Lab41 / Misc
Miscellaneous utility functions
☆11 · Nov 17, 2016 · Updated 9 years ago