YAIxPOZAlabs / Improving-TrXL-for-ComMU
YAI 11 x @POZAlabs : Improving & Evaluating Music Generation with ComMU
☆13Updated 2 years ago
Alternatives and similar repositories for Improving-TrXL-for-ComMU
Users that are interested in Improving-TrXL-for-ComMU are comparing it to the libraries listed below
- ☆27Updated 2 years ago
- Code for Novel View Acoustic Synthesis paper☆48Updated last year
- YAI 11 x @POZAlabs : Music generation & modification from Unclear midi SEquence with Diffusion model☆27Updated last year
- to release the source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550☆12Updated 8 months ago
- A simple library for extracting representations from Jukebox☆34Updated 2 years ago
- Download scripts and tools for Replay dataset.☆33Updated 2 years ago
- Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23]☆17Updated last year
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis☆39Updated last year
- [ICLR 2025] NeRAF jointly learns acoustic and radiance fields, enabling realistic audio-visual generation.☆21Updated 2 months ago
- Code for "SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces" ACM MM 2023☆30Updated last year
- ☆46Updated last year
- Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness (ICASSP 202…☆70Updated last year
- [NeurIPS'24 Spotlight] Official code for "Acoustic Volume Rendering for Neural Impulse Response Fields"☆39Updated 6 months ago
- Code release for PianoMotion10M☆86Updated 3 months ago
- Official PyTorch implementation of the paper "A Brand New Dance Partner: Music-Conditioned Pluralistic Dancing Synthesized by Multiple Dan…☆36Updated 3 years ago
- NeMo: a toolkit for conversational AI☆9Updated 2 weeks ago
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024).☆25Updated last year
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆24Updated 10 months ago
- We present a model that can generate accurate 3D sound fields of human bodies from headset microphones and body pose as inputs.☆87Updated last year
- BEGANSing - Korean SVS + SVC + AudioSR☆11Updated last year
- ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer☆34Updated 6 months ago
- ☆47Updated 11 months ago
- ☆15Updated last year
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation☆39Updated last year
- Implementation for the paper "Can Language Models Learn to Listen?"☆65Updated last year
- Art2Mus is a system that generates music based on digitized artworks and text by using the AudioLDM2 architecture with an added projectio…☆16Updated 7 months ago
- ☆8Updated 11 months ago
- Flow Matching implemented in PyTorch☆39Updated 6 months ago
- [INTERSPEECH'24] Official repository for "Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert"☆16Updated 3 weeks ago
- Official source codes of airsep☆36Updated last year