lpscr / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆48 · Updated 3 months ago
Alternatives and similar repositories for F5-TTS:
Users who are interested in F5-TTS are comparing it to the libraries listed below.
- Pip-installable package for StyleTTS 2 human-level text-to-speech and voice cloning · ☆154 · Updated 7 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine-tune or train a voice model. · ☆29 · Updated this week
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) · ☆18 · Updated 2 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… · ☆67 · Updated 5 months ago
- Awesome music generation model: MG² · ☆141 · Updated last month
- ☆122 · Updated 3 months ago
- ☆95 · Updated 10 months ago
- ☆58 · Updated 5 months ago
- Running the F5-TTS by ONNX Runtime · ☆115 · Updated last week
- Turn any common eBook file into an HQ Audiobook with F5-TTS (Easy Install) · ☆20 · Updated 2 months ago
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and… · ☆51 · Updated this week
- Advanced RVC Inference for quicker and effortless model downloads · ☆45 · Updated 2 weeks ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis · ☆222 · Updated this week
- AI-powered speech denoising and enhancement. Adapted for Windows and optimized · ☆81 · Updated 8 months ago
- Misc. tools/scripts that I made to use for tortoise · ☆22 · Updated 6 months ago
- ☆206 · Updated 5 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion · ☆173 · Updated 5 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models · ☆33 · Updated 4 months ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. · ☆236 · Updated 9 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … · ☆14 · Updated 5 months ago
- API for a Vocal Remover that uses Deep Neural Networks. · ☆99 · Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. · ☆60 · Updated last week
- ☆39 · Updated 10 months ago
- Text-to-Music Generation with Rectified Flow Transformer · ☆60 · Updated 6 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion · ☆10 · Updated 5 months ago