smtiitm / Fastspeech2_HS
Indic TTS for Indian Languages: a project to develop text-to-speech (TTS) synthesis systems for Indian languages, improve synthesis quality, and build small-footprint TTS integrated with disability aids and various other applications.
☆28 · Updated 3 weeks ago
Alternatives and similar repositories for Fastspeech2_HS:
Users who are interested in Fastspeech2_HS are comparing it to the libraries listed below.
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR ☆50 · Updated 10 months ago
- Text-to-Speech for languages of India ☆230 · Updated 5 months ago
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face. ☆302 · Updated last year
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu… ☆14 · Updated last year
- ☆130 · Updated 4 months ago
- ☆356 · Updated 7 months ago
- Fine-tune VITS and MMS using Hugging Face's tools ☆145 · Updated last year
- NPTEL2020: Speech2Text dataset for Indian-English Accent ☆75 · Updated 3 years ago
- ☆39 · Updated last year
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆240 · Updated last month
- ☆284 · Updated 10 months ago
- ☆14 · Updated 3 weeks ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2 ☆85 · Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction ☆91 · Updated last year
- ☆26 · Updated 3 weeks ago
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆106 · Updated 6 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆77 · Updated 10 months ago
- ☆43 · Updated 2 years ago
- This repository contains the training, inference, and evaluation code for SpeechLLM models and details about the model releases on huggingfac… ☆99 · Updated 10 months ago
- ☆46 · Updated 2 years ago
- ☆269 · Updated 10 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆81 · Updated last year
- Updates ASR papers every day ☆196 · Updated this week
- A Python package for the Whisper normalizer ☆55 · Updated this week
- This project performs speaker diarization for the Hindi language. ☆49 · Updated 4 years ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆411 · Updated 3 weeks ago
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆150 · Updated 2 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆206 · Updated last week
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆224 · Updated 3 weeks ago
- Various speech datasets made available to the public ☆116 · Updated 4 months ago