anhnh2002 / XTTSv2-Finetuning-for-New-Languages
☆123 Updated 3 months ago
Alternatives and similar repositories for XTTSv2-Finetuning-for-New-Languages:
Users who are interested in XTTSv2-Finetuning-for-New-Languages are comparing it to the libraries listed below.
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆69 Updated 4 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆173 Updated 5 months ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ☆239 Updated 9 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆68 Updated last year
- ☆207 Updated 5 months ago
- Automatically Update Text-to-Speech (TTS) Papers Daily using GitHub Actions (Updates Every 12 Hours) ☆385 Updated this week
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆147 Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆242 Updated 2 months ago
- Finetune VITS and MMS using HuggingFace's tools ☆137 Updated 11 months ago
- ☆352 Updated 6 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆67 Updated 5 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆49 Updated 4 months ago
- Your one-stop solution for voice dataset creation ☆117 Updated last year
- Official Implementation of StyleTTS ☆429 Updated 2 months ago
- ☆253 Updated last year
- Text to speech alignment using CTC forced alignment ☆233 Updated 3 weeks ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3 ☆395 Updated 6 months ago
- Update ASR papers every day ☆168 Updated this week
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆232 Updated last week
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆154 Updated 8 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆193 Updated 11 months ago
- Unofficial implementation of NVIDIA P-Flow TTS paper ☆220 Updated 2 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆92 Updated 10 months ago
- Official Implementation of StyleTTS-VC ☆177 Updated 2 months ago
- XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023) ☆317 Updated 7 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch ☆449 Updated last week
- Application of MB-iSTFT-VITS components to vits2_pytorch ☆124 Updated 4 months ago
- Running the F5-TTS by ONNX Runtime ☆123 Updated this week
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆144 Updated last week
- F5-TTS inference acceleration, roughly 4x faster! ☆59 Updated 2 months ago