smtiitm / Fastspeech2_HS
Indic TTS for Indian Languages: a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving the quality of synthesis, and building small-footprint TTS integrated with disability aids and various other applications.
☆20Updated this week
Alternatives and similar repositories for Fastspeech2_HS:
Users who are interested in Fastspeech2_HS compare it to the repositories listed below.
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR☆46Updated 7 months ago
- Finetune VITS and MMS using HuggingFace's tools☆134Updated 10 months ago
- Text-to-Speech for languages of India☆208Updated 3 months ago
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu…☆15Updated last year
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.☆286Updated last year
- ☆18Updated last month
- Fine-Tune Whisper with Transformers and PEFT☆50Updated last year
- ☆265Updated 8 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion☆171Updated 4 months ago
- ☆43Updated 2 years ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2☆82Updated 11 months ago
- ☆345Updated 5 months ago
- ☆39Updated last year
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS.☆68Updated 3 months ago
- ☆114Updated 2 months ago
- Timething is a library for aligning text transcripts with their audio recordings.☆113Updated 2 months ago
- ☆71Updated last year
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription☆64Updated 8 months ago
- A curated list of speech datasets (110+ datasets, 75+ easy to download)☆122Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆107Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆80Updated last year
- ☆273Updated 8 months ago
- Create an LJSpeech-structured voice dataset from WAV input☆25Updated 4 months ago
- A non-native English corpus for pronunciation scoring task☆123Updated 7 months ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS☆34Updated 2 months ago
- This repository contains the training, inference, and evaluation code for SpeechLLM models and details about the model releases on huggingfac…☆84Updated 7 months ago
- ASR papers, updated every day☆128Updated this week
- On-device speaker diarization powered by deep learning☆37Updated this week
- A subset of the popular LibriTTS dataset with subsets for English, Scottish, Welsh, and Irish accents.☆14Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆93Updated 4 months ago