jaeyeonkim99 / EnCLAP
Official Implementation of EnCLAP (ICASSP 2024)
☆88 Updated 3 months ago
Related projects:
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆81 Updated last month
- ☆99 Updated this week
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆44 Updated last month
- VoiceLDM: Text-to-Speech with Environmental Context ☆157 Updated last month
- PAM is a no-reference audio quality metric for audio generation tasks ☆42 Updated 2 months ago
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆108 Updated 3 months ago
- ☆37 Updated 3 months ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ☆112 Updated 3 weeks ago
- The open source code for SimpleSpeech series ☆85 Updated last month
- Unofficial download repository for MusicCaps ☆41 Updated last year
- ☆50 Updated last year
- ICASSP 2023 Accepted ☆189 Updated 4 months ago
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit… ☆73 Updated last year
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆127 Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆50 Updated 10 months ago
- ☆60 Updated last year
- The open source code for LLM-Codec ☆106 Updated last month
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆111 Updated 6 months ago
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆107 Updated last month
- Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing ☆85 Updated last week
- An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data" ☆131 Updated last year
- Expressive Anechoic Recordings of Speech (EARS) ☆123 Updated 2 months ago
- Unofficial pytorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training ☆130 Updated last year
- Unofficial implementation of NANSY++ in Pytorch Lightning ☆46 Updated 6 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆79 Updated 2 months ago
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis ☆139 Updated last year
- Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆75 Updated 2 weeks ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆41 Updated last week
- Robust Singing Voice Transcription and MIDI Extraction ☆47 Updated last month
- [InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter ☆70 Updated 2 months ago