XZWY / MSLDM
Implementation of Multi-Source Music Generation with Latent Diffusion.
☆22 Updated 5 months ago
Alternatives and similar repositories for MSLDM:
Users who are interested in MSLDM are comparing it to the repositories listed below.
- ☆37 Updated 8 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆76 Updated last month
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆62 Updated 5 months ago
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ☆53 Updated 3 weeks ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆40 Updated 4 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆50 Updated 3 months ago
- Codebase and project page for EDMSound ☆34 Updated last year
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆28 Updated 8 months ago
- ☆25 Updated 6 months ago
- Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24) ☆27 Updated 3 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆51 Updated last year
- ☆43 Updated 8 months ago
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆66 Updated 7 months ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆63 Updated 10 months ago
- ☆72 Updated 2 months ago
- Official repo of ICASSP 2024 paper - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆49 Updated last month
- "Music Style Transfer with Time-Varying Inversion of Diffusion Models" ☆39 Updated 6 months ago
- ☆58 Updated 3 months ago
- Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementa… ☆19 Updated last year
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 Updated last year
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆32 Updated last year
- Official code for the CVPR'24 paper Diff-BGM ☆55 Updated 4 months ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion ☆46 Updated last week
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆78 Updated last month
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆36 Updated last year
- SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024) ☆64 Updated 3 weeks ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆44 Updated 5 months ago
- AudioSR-Upsampling (any -> 48kHz) ☆38 Updated last year
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆31 Updated 3 months ago