lpscr / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆43 · Updated 3 months ago
Alternatives and similar repositories for F5-TTS:
Users who are interested in F5-TTS are comparing it to the repositories listed below.
- Running F5-TTS with ONNX Runtime ☆104 · Updated last week
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆150 · Updated 7 months ago
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆44 · Updated last week
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆174 · Updated this week
- Uses deepgram/whisper/custom models to create an LJSpeech dataset for voice model fine tuning ☆25 · Updated last week
- Misc. tools/scripts that I made to use for tortoise ☆22 · Updated 6 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆66 · Updated 4 months ago
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) ☆18 · Updated last month
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆32 · Updated 3 months ago
- ☆94 · Updated 9 months ago
- Turn any common eBook file into an HQ Audiobook with F5-TTS (Easy Install) ☆19 · Updated 2 months ago
- Awesome music generation model: MG² ☆137 · Updated 2 weeks ago
- ☆117 · Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆119 · Updated last week
- ☆58 · Updated 5 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … ☆14 · Updated 4 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io ☆67 · Updated last year
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆171 · Updated 4 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆57 · Updated last week
- API for a Vocal Remover that uses Deep Neural Networks. ☆97 · Updated 7 months ago
- AI powered speech denoising and enhancement. Adapted for windows and optimized ☆79 · Updated 7 months ago
- ☆200 · Updated 4 months ago
- ☆38 · Updated 9 months ago
- VALL-E 2 reproduction ☆113 · Updated 7 months ago
- List of repositories relevant to VITS. ☆36 · Updated last year
- Audio datasets, easier. ☆82 · Updated last year
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆337 · Updated last week
- ☆80 · Updated 7 months ago
- G2P ☆119 · Updated this week