AI4Bharat / IndicF5
☆40 · Updated 3 months ago
Alternatives and similar repositories for IndicF5
Users interested in IndicF5 are comparing it to the repositories listed below.
- ☆157 · Updated 7 months ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ☆247 · Updated last year
- ☆240 · Updated last month
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR ☆59 · Updated last month
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆173 · Updated 2 months ago
- ☆272 · Updated last year
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆160 · Updated last year
- A simple voice conversion tool ☆17 · Updated 3 years ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆194 · Updated 2 months ago
- Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 · Updated last year
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023 ☆54 · Updated 2 years ago
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu… ☆35 · Updated 3 weeks ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆63 · Updated last month
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu… ☆14 · Updated last year
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆184 · Updated 9 months ago
- Analysis of XLS-R for Speech Quality Assessment ☆13 · Updated 5 months ago
- Text-to-Speech for languages of India ☆258 · Updated 8 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆82 · Updated 8 months ago
- Mispronunciation Detection using a pretrained and finetuned wav2vec2 model for phoneme recognition and diagnosis and feedback using large… ☆26 · Updated last year
- ☆61 · Updated 11 months ago
- Deep Learning model for lexical stress detection in spoken English ☆29 · Updated 5 years ago
- Speaker diarization model ☆28 · Updated 2 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 · Updated last year
- Building a Deep learning model that predicts the gender of a speaker using TensorFlow 2 ☆126 · Updated 2 years ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆62 · Updated 8 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆41 · Updated 2 weeks ago
- Awesome lists about Speech Emotion Recognition ☆93 · Updated 6 months ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS ☆43 · Updated 7 months ago
- A lightweight, efficient variation of the StyleTTS 2 text-to-speech model. ☆35 · Updated last month
- Official Implementation of StyleTTS ☆439 · Updated 6 months ago