unixpickle / vq-voice-swapView external linksLinks
Voice swapping with VQ-VAE and diffusion models
☆68Nov 8, 2021Updated 4 years ago
Alternatives and similar repositories for vq-voice-swap
Users that are interested in vq-voice-swap are comparing it to the libraries listed below
Sorting:
- ☆11Nov 7, 2024Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- ☆14Aug 1, 2025Updated 6 months ago
- Non Parallel Voice Conversion based on VITS☆24Mar 31, 2023Updated 2 years ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA☆18Aug 16, 2024Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀☆20May 20, 2025Updated 8 months ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implement☆16Sep 13, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- ☆16Dec 12, 2023Updated 2 years ago
- StyleTTS 2 Optimized Training Fork☆33Feb 2, 2025Updated last year
- ☆32Jul 27, 2022Updated 3 years ago
- An AR+AR TTS attempt.☆18Jan 13, 2025Updated last year
- Sequence alignement methods with helpers for PyTorch.☆24Nov 30, 2022Updated 3 years ago
- Real-time end-to-end singing voice convertion☆23Nov 3, 2024Updated last year
- A port of muP to JAX/Haiku☆25Oct 23, 2022Updated 3 years ago
- ☆46Apr 16, 2023Updated 2 years ago
- Try to replicate the architecture of MiniMaxTTS mentioned in it's technical report☆49Sep 2, 2025Updated 5 months ago
- Hybrid speech synthesiser☆28Feb 18, 2019Updated 6 years ago
- ☆24Sep 27, 2022Updated 3 years ago
- 44100Hz日本語音源に対応させた unofficial vits2-TTS implementation in pytorchです。☆24Sep 1, 2023Updated 2 years ago
- ☆22Jun 8, 2021Updated 4 years ago
- ☆54Jul 16, 2025Updated 7 months ago
- Streaming Vocos☆29Jun 10, 2025Updated 8 months ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing☆71Dec 2, 2022Updated 3 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 2 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- Submit tweets for https://twitter.com/SemanticRelease using pull requests☆13Aug 6, 2024Updated last year
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset☆12Sep 29, 2025Updated 4 months ago
- ☆23Nov 3, 2025Updated 3 months ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- ☆13Nov 22, 2022Updated 3 years ago
- text to speech☆10Mar 19, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- Arabic Grapheme-to-Phoneme (G2P) Conversion☆13Mar 15, 2025Updated 11 months ago
- ☆28Nov 15, 2023Updated 2 years ago
- ☆26Mar 25, 2023Updated 2 years ago
- ☆26Mar 20, 2024Updated last year
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513☆64Feb 13, 2023Updated 3 years ago