mobassir94 / comprehensive-bangla-tts
Aims to build a complete multilingual TTS pipeline, with a main focus on releasing, for the first time, high-performing Coqui 🐸TTS (text-to-speech) based neural voice cloning systems for Bangla. Supports different SOTA models for Bangla as well as a multilingual (Arabic + Bengali) code-mixed TTS pipeline.
★41 · Updated last year
Alternatives and similar repositories for comprehensive-bangla-tts
Users that are interested in comprehensive-bangla-tts are comparing it to the libraries listed below
- Text to Speech for Indic languages ★51 · Updated 3 years ago
- Transformer based Bangla Speech Recognition ★53 · Updated 2 years ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ★28 · Updated 2 years ago
- Automatic Context Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance ★20 · Updated 6 months ago
- Bangla TTS Inference pipeline using Vit TTS ★8 · Updated last year
- Bangla Unicode Normalization ★20 · Updated last year
- Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM ★39 · Updated last year
- A python package for whisper normalizer ★60 · Updated 3 weeks ago
- NPTEL2020: Speech2Text dataset for Indian-English Accent ★76 · Updated 3 years ago
- ★43 · Updated 2 years ago
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023 ★54 · Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ★27 · Updated last year
- asr2k ★50 · Updated last year
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ★12 · Updated 2 years ago
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ★23 · Updated 10 months ago
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline ★32 · Updated 2 years ago
- pytorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper ★21 · Updated 2 years ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ★36 · Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ★28 · Updated 2 years ago
- ★46 · Updated 2 years ago
- Zero-shot Audio Classification using Whisper ★79 · Updated 2 years ago
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level. ★25 · Updated 2 years ago
- create dataset from list of youtube links easily ★18 · Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ★98 · Updated 7 months ago
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the… ★47 · Updated 2 years ago
- ★17 · Updated 4 years ago
- A simple voice conversion tool ★17 · Updated 3 years ago
- ★40 · Updated last month
- Repository containing an experimentation platform for training and running inference on wav2vec2 models. ★87 · Updated 2 years ago
- ★76 · Updated 3 years ago