YoonjinXD/T-FOLEY

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/YoonjinXD/T-FOLEY)

YoonjinXD / T-FOLEY

Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, accepted in 2024 ICASSP

☆34

Alternatives and similar repositories for T-FOLEY

Users that are interested in T-FOLEY are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

DCASE2024-Task7-Sound-Scene-Synthesis / AudioLDM-training-finetuning
View on GitHub
AudioLDM training, finetuning, evaluation and inference.
☆14Mar 27, 2024Updated 2 years ago
PapayaResearch / ctag
View on GitHub
[ICML'24] Creative Text-to-Audio Generation via Synthesizer Programming
☆41Sep 26, 2024Updated last year
yuhanghe01 / RiTTA
View on GitHub
Event Relation in Text-to-Audio (TTA) Generation
☆21Feb 26, 2025Updated last year
ismir-24-sub / unsupervised_compositional_representations
View on GitHub
ISMIR 24 Supplementary Material
☆14Oct 28, 2024Updated last year
YoonjinXD / kadtk
View on GitHub
A standardized toolkit of Kernel Audio Distance (KAD)—a distribution-free, unbiased, and computationally efficient metric for evaluating …
☆104Jun 12, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
mdx-workshop / mdx-submissions21
View on GitHub
Music Demixing Challenge Submission Repo
☆16Sep 8, 2023Updated 2 years ago
liuhuadai / AudioLCM
View on GitHub
PyTorch Implementation of [AudioLCM]: a efficient and high-quality text-to-audio generation with latent consistency model.
☆13Jun 15, 2024Updated 2 years ago
kaistmm / fregrad
View on GitHub
[ICASSP 2024] Official code for FreGrad
☆35May 13, 2024Updated 2 years ago
AMAAI-Lab / JamendoMaxCaps
View on GitHub
JamendoMaxCaps is a large-scale dataset of 362,000 instrumental creative commons tracks
☆53May 24, 2025Updated last year
naver-ai / rewas
View on GitHub
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44Dec 13, 2024Updated last year
fundwotsai2001 / Text-to-music-dataset-preparation
View on GitHub
A repo that builds text to music datasets from scratch, used in MuseContorlLite [ICML2025]
☆28May 20, 2025Updated last year
mcomunita / syncfusion
View on GitHub
SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis
☆19Jul 22, 2024Updated 2 years ago
NVIDIA / audio-intelligence
View on GitHub
Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti…
☆137Mar 3, 2026Updated 4 months ago
cyanbx / Frieren-V2A
View on GitHub
Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24)
☆62Apr 3, 2025Updated last year
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
TeeJayBaker / PolyDDSP
View on GitHub
Polyphonic generalisation of DDSP
☆22Apr 30, 2024Updated 2 years ago
asdfang / constraint-transformer-bach
View on GitHub
Transformer with constraints on Bach chorales
☆11Aug 14, 2020Updated 5 years ago
IDEA-Emdoor-Lab / DistilCodec
View on GitHub
A Neural Audio Codec (NAC) for Universal Audio
☆47May 30, 2025Updated last year
walker-hyf / ECSS
View on GitHub
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59Jun 20, 2024Updated 2 years ago
sarulab-speech / Coco-Nut
View on GitHub
Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus
☆21Jun 12, 2024Updated 2 years ago
yoongi43 / music_audio_enhancement_conformer
View on GitHub
Implementation of the paper "Exploiting Time-Frequency Conformers for Music Audio Enhancement"
☆14Mar 21, 2025Updated last year
gladia-research-group / latent-autoregressive-source-separation
View on GitHub
☆18Apr 28, 2023Updated 3 years ago
sh-lee97 / grafx-prune
View on GitHub
Searching for Music Mixing Graphs: A Pruning Approach
☆26Feb 13, 2025Updated last year
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
slSeanWU / beats-conformer-bart-audio-captioner
View on GitHub
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41Jan 6, 2024Updated 2 years ago
seungheondoh / music-text-representation-pp
View on GitHub
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]
☆43Oct 7, 2024Updated last year
Badr-MOUFAD / mgdm
View on GitHub
Code for MGDM algorithm, ICML 2025, https://arxiv.org/abs/2502.03332
☆16May 19, 2025Updated last year
mzsun01 / MM-LDM
View on GitHub
☆11Apr 12, 2024Updated 2 years ago
lavendery / AudioComposer
View on GitHub
☆27Sep 10, 2025Updated 10 months ago
ldzhangyx / awesome-mir-labs
View on GitHub
A full collection of Music Informatic Retrieval (MIR) and AI Music labs.
☆53Dec 27, 2024Updated last year
jorshi / drumblender
View on GitHub
Synthesis of percussion sounds using sinusoidal modelling, DDSP noise synthesis, and a neural source filter approach.
☆35Jan 7, 2025Updated last year
Yip-Jia-Qi / codecformer
View on GitHub
☆21Jul 15, 2024Updated 2 years ago
jishengpeng / TextrolSpeech
View on GitHub
[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
☆187Nov 22, 2024Updated last year
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
seungheondoh / speech-to-music
View on GitHub
Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23]
☆17Aug 16, 2023Updated 2 years ago
CPJKU / cpjku_dcase24
View on GitHub
☆29Oct 17, 2024Updated last year
SonyCSLParis / Stem-JEPA
View on GitHub
Joint Embedding Predictive Architecture for Musical Stem Compatibility Estimation
☆55Aug 6, 2024Updated last year
aurianworld / matpac
View on GitHub
Official MATPAC implementation and trained model's weights
☆38Jun 2, 2026Updated last month
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
adrianbarahona / noisebandnet
View on GitHub
Code for the "NoiseBandNet: Controllable Time-Varying Neural Synthesis of Sound Effects Using Filterbanks" paper.
☆39Jul 8, 2024Updated 2 years ago
AMAAI-Lab / mustango
View on GitHub
Mustango: Toward Controllable Text-to-Music Generation
☆394Jun 2, 2025Updated last year