Vyvo-Labs / SpeechPlus
SpeechPlus: Small LLM-Based Text-to-Speech Library
15 stars, updated 6 months ago
Alternatives and similar repositories for SpeechPlus
Users interested in SpeechPlus are comparing it to the libraries listed below.
- a Neural Vocoder supporting Ring Attention, Conformer and NSF. (23 stars, updated 3 months ago)
- (44 stars, updated 4 months ago)
- LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM (18 stars, updated last year)
- Forced alignment decoder for Whisper. (14 stars, updated last year)
- (29 stars, updated last month)
- (11 stars, updated last year)
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one (26 stars, updated last year)
- The Vokan Architecture (Tsukasa speech based) (10 stars, updated 9 months ago)
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion (28 stars, updated 2 months ago)
- A collection of all our phonemizers for dataset construction and inference (27 stars, updated 9 months ago)
- StyleTTS 2 Optimized Training Fork (34 stars, updated 9 months ago)
- Unofficial implementation of ConvNeXt-TTS powered by lightning (17 stars, updated last year)
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors (35 stars, updated 9 months ago)
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. (15 stars, updated 6 months ago)
- (14 stars, updated last year)
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" (11 stars, updated 7 months ago)
- speaker-disentangled speech linguistic content quantizer (23 stars, updated 8 months ago)
- Text-To-Speech for NotebookLM (35 stars, updated 4 months ago)
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation (16 stars, updated last year)
- ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models (31 stars, updated last week)
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks (17 stars, updated 2 years ago)
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… (29 stars, updated 2 years ago)
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … (29 stars, updated 3 weeks ago)
- An official implementation of Style-Talker for Spoken Dialogue Generation (23 stars, updated 10 months ago)
- text to speech (10 stars, updated last year)
- (19 stars, updated last year)
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in … (55 stars, updated 3 months ago)
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder (28 stars, updated 3 months ago)
- Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class conditioning built on F5-TTS (23 stars, updated 2 weeks ago)
- Pushing the Limits of Zero-shot End-to-End Speech Translation (26 stars, updated 11 months ago)