My implementation of diffusion (like) models
☆11Apr 14, 2023Updated 2 years ago
Alternatives and similar repositories for diffusion
Users who are interested in diffusion are comparing it to the libraries listed below
- ☆19Feb 2, 2023Updated 3 years ago
- text to speech☆10Mar 19, 2024Updated last year
- VI-SVC model is just VITS without MAS and DurationPredictor.☆10Nov 9, 2023Updated 2 years ago
- The source code for the paper XiaoiceSing2 (Interspeech 2023)☆49Jan 15, 2024Updated 2 years ago
- Project of Singing Voice Conversion.☆16Oct 27, 2023Updated 2 years ago
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 2 years ago
- 4G GPU & 10 minutes for training☆12Aug 9, 2023Updated 2 years ago
- End-to-end speech synthesis system with knowledge distillation☆18Jul 16, 2022Updated 3 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆19Feb 9, 2025Updated last year
- PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor☆17Apr 13, 2023Updated 2 years ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Sep 16, 2022Updated 3 years ago
- Sovits5 with RMVPE☆14Jul 17, 2023Updated 2 years ago
- ACT-Bench – We Evaluate Action-Fidelity of World Models for Autonomous Driving☆26Dec 23, 2024Updated last year
- ☆19Mar 22, 2024Updated last year
- TTS model based on VITS, FastSpeech2, and VISinger☆24Mar 9, 2023Updated 2 years ago
- ☆25Jun 4, 2024Updated last year
- A pitch detection model trained to be robust against noise and reverberation environments.☆27Jan 21, 2025Updated last year
- ☆54Jul 16, 2025Updated 7 months ago
- Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset☆27May 30, 2025Updated 9 months ago
- C++ version of pyannote audio speaker diarization pipeline☆22Feb 14, 2024Updated 2 years ago
- BigVGAN with Neural Source-Filter☆56Sep 21, 2023Updated 2 years ago
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- ☆26Mar 20, 2024Updated last year
- Streaming Vocos☆30Jun 10, 2025Updated 8 months ago
- Codebase for the ICLR '23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"☆36Feb 10, 2026Updated 2 weeks ago
- Official implementation of "Equivariant Self-Supervision for Musical Tempo Estimation (ISMIR 2022)"☆26Feb 6, 2023Updated 3 years ago
- Export an ONNX graph that performs ISTFT. Designed for TTS models.☆27Apr 23, 2024Updated last year
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion.☆32Apr 10, 2023Updated 2 years ago
- Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- ☆68Jul 23, 2023Updated 2 years ago
- [NeurIPS 2021] Diffusion Normalizing Flow (DiffFlow)☆120Sep 13, 2023Updated 2 years ago
- My vocoder experiments☆31Jul 26, 2025Updated 7 months ago
- ☆26Sep 22, 2022Updated 3 years ago
- Viterbi decoding in PyTorch☆40Sep 10, 2025Updated 5 months ago
- Chinese polyphone disambiguation for Text-to-Speech application☆42Jun 11, 2024Updated last year
- A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation☆36Jan 17, 2024Updated 2 years ago