AI4Bharat / IndicF5
☆19Updated last month
Alternatives and similar repositories for IndicF5
Users who are interested in IndicF5 are comparing it to the repositories listed below
- ☆138Updated 5 months ago
- Finetune VITS and MMS using HuggingFace's tools☆151Updated last year
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR☆53Updated 2 weeks ago
- ☆43Updated 2 years ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation☆155Updated last week
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆169Updated last year
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu…☆29Updated last week
- A project for performing speaker diarization for the Hindi language.☆49Updated 4 years ago
- A python package for whisper normalizer☆59Updated last week
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023☆54Updated 2 years ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS.☆77Updated 6 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆83Updated last year
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS☆40Updated 5 months ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆63Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆51Updated 10 months ago
- ☆30Updated last month
- Various speech datasets made available to the public☆118Updated 5 months ago
- Official Repository For VoxBlink2☆67Updated 9 months ago
- ☆287Updated 11 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆149Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆250Updated 4 months ago
- ☆46Updated 2 years ago
- ☆359Updated 8 months ago
- The VoxTube dataset official repository☆68Updated last year
- ☆100Updated 2 weeks ago
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch☆269Updated last year
- An unofficial PyTorch implementation of VALL-E☆87Updated last week
- Updates ASR papers every day☆208Updated this week
- An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io☆67Updated last year
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆196Updated last week