wsntxxn/TextToAudioGrounding

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/wsntxxn/TextToAudioGrounding)

wsntxxn / TextToAudioGrounding

The dataset and baseline code for Text-to-Audio Grounding (TAG)

☆49

Alternatives and similar repositories for TextToAudioGrounding

Users that are interested in TextToAudioGrounding are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

wsntxxn / AudioCaption
View on GitHub
Audio captioning recipe
☆53Oct 23, 2025Updated 9 months ago
zeyuxie29 / AudioTime
View on GitHub
☆39Jul 4, 2024Updated 2 years ago
Ming-er / Audio-Free-P-Tuning
View on GitHub
☆11Dec 28, 2023Updated 2 years ago
audio-captioning / audio-captioning-resources
View on GitHub
A list of resources that can help in research for automated audio captioning
☆34Feb 17, 2021Updated 5 years ago
XinhaoMei / WavCaps
View on GitHub
This reporsitory contains metadata of WavCaps dataset and codes for downstream tasks.
☆264Jul 25, 2024Updated last year
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
Labbeti / aac-datasets
View on GitHub
Audio Captioning datasets for PyTorch.
☆129Mar 25, 2026Updated 3 months ago
zeyuxie29 / PicoAudio
View on GitHub
☆45Jan 13, 2025Updated last year
Ming-er / LGC-SED
View on GitHub
☆13Jan 3, 2024Updated 2 years ago
Ming-er / MGA-CLAP
View on GitHub
official implementation of MGA-CLAP (ACM MM 2024)
☆29Oct 25, 2024Updated last year
xiaomi-research / dasheng-tokenizer
View on GitHub
State-of-the-art continious audio tokenization
☆40Mar 9, 2026Updated 4 months ago
lysanderism / TimeAudio
View on GitHub
The official repository TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…
☆30Nov 18, 2025Updated 8 months ago
zeyuxie29 / SemanticVocoder
View on GitHub
☆28Apr 6, 2026Updated 3 months ago
MorenoLaQuatra / ARCH
View on GitHub
ARCH: Audio Representations benCHmark
☆57Aug 26, 2024Updated last year
XinhaoMei / ACT
View on GitHub
Source code for the paper 'Audio Captioning Transformer'
☆56Jan 18, 2022Updated 4 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
FishMaster93 / AFFIA3K
View on GitHub
☆10Apr 12, 2023Updated 3 years ago
JishengBai / AudioSetCaps
View on GitHub
A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline
☆208Dec 13, 2024Updated last year
Soldelli / VLG-Net
View on GitHub
VLG-Net: Video-Language Graph Matching Networks for Video Grounding
☆31May 31, 2022Updated 4 years ago
yanghaha0908 / WavCube
View on GitHub
Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"
☆62Jun 27, 2026Updated 3 weeks ago
microsoft / NoAudioCaptioning
View on GitHub
Repository for "Training Audio Captioning Models without Audio"
☆10Sep 26, 2023Updated 2 years ago
xiquan-li / MeanAudio
View on GitHub
[ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
☆142Sep 2, 2025Updated 10 months ago
akoepke / audio-retrieval-benchmark
View on GitHub
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54Jul 16, 2025Updated last year
KdaiP / DC-Speech-VAE
View on GitHub
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57Nov 19, 2025Updated 8 months ago
JinhuaLiang / lam4fsl
View on GitHub
An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"
☆31May 31, 2023Updated 3 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
turpaultn / DESED
View on GitHub
Repo associated to the DESED dataset, download and creation of data
☆154Jul 16, 2024Updated 2 years ago
sony / CLIPSep
View on GitHub
☆43Feb 21, 2023Updated 3 years ago
audio-captioning / caption-evaluation-tools
View on GitHub
Tools for the evaluation of audio captioning.
☆19May 23, 2020Updated 6 years ago
xiquan-li / Awesome-Audio-Generation
View on GitHub
Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation
☆74Updated this week
wsntxxn / UniFlow-Audio
View on GitHub
☆73Jul 17, 2026Updated last week
XinhaoMei / audio-text_retrieval
View on GitHub
Implementation of our paper 'On Metric Learning For Audio-Text Cross-Modal Retrieval'
☆51May 17, 2022Updated 4 years ago
audio-captioning / audio-captioning-papers
View on GitHub
A list of papers about audio captioning
☆78Jul 1, 2022Updated 4 years ago
HumBug-Mosquito / HumBugDB
View on GitHub
Acoustic mosquito detection code with Bayesian Neural Networks
☆62Oct 4, 2021Updated 4 years ago
PigeonDan1 / paper_claw
View on GitHub
Paper Claw sends personalized daily research digests from arXiv and beyond straight to your inbox, featuring customizable categories, int…
☆32Updated this week
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
tqbl / ood_audio
View on GitHub
An audio classification system for learning with out-of-distribution data
☆33Dec 8, 2022Updated 3 years ago
liuxubo717 / V-ACT
View on GitHub
Visually-Aware Audio Captioning
☆43Mar 3, 2023Updated 3 years ago
liuxubo717 / sound_generation
View on GitHub
Code and generated sounds for "Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning", MLSP 2021
☆69Sep 3, 2021Updated 4 years ago
liuxubo717 / LASS
View on GitHub
This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022
☆146Oct 11, 2023Updated 2 years ago
etzinis / heterogeneous_separation
View on GitHub
Code and data recipes for the paper: Heterogeneous Target Speech Separation
☆44Dec 6, 2022Updated 3 years ago
diggerdu / AudioMamba
View on GitHub
☆12Jun 1, 2024Updated 2 years ago
NieeiM / Dasheng-Audiogen
View on GitHub
Generate a complete audio clip with music, intelligible speech, and sound effects from text in one pass.
☆44May 27, 2026Updated last month