smtiitm / Fastspeech2_HS
Indic TTS for Indian Languages: This project develops text-to-speech (TTS) synthesis systems for Indian languages, improves synthesis quality, and builds small-footprint TTS integrated with disability aids and various other applications.
☆29 · Updated last week
Alternatives and similar repositories for Fastspeech2_HS
Users who are interested in Fastspeech2_HS compare it to the repositories listed below.
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR ☆53 · Updated 2 weeks ago
- Text-to-Speech for languages of India ☆241 · Updated 6 months ago
- ☆19 · Updated last month
- ☆138 · Updated 5 months ago
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface. ☆306 · Updated last year
- Finetune VITS and MMS using HuggingFace's tools ☆151 · Updated last year
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆246 · Updated last month
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆155 · Updated last week
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆81 · Updated 11 months ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2 ☆86 · Updated last year
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu… ☆14 · Updated last year
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆196 · Updated last week
- ☆30 · Updated last month
- Efficient approach to speaker diarization using voice characteristics extraction ☆94 · Updated last year
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆106 · Updated 7 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆228 · Updated last month
- ☆46 · Updated 2 years ago
- ☆287 · Updated 11 months ago
- ASR papers, updated every day ☆208 · Updated this week
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac… ☆104 · Updated 10 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆77 · Updated 6 months ago
- ☆359 · Updated 8 months ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆416 · Updated last month
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆144 · Updated last year
- ☆156 · Updated last week
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆83 · Updated last year
- Translation models for 22 scheduled languages of India ☆314 · Updated 2 weeks ago
- ☆39 · Updated last year
- Some comprehensive papers about speaker diarization ☆277 · Updated 2 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆178 · Updated 7 months ago