AI4Bharat / IndicF5
☆14Updated 3 weeks ago
Alternatives and similar repositories for IndicF5:
Users who are interested in IndicF5 are comparing it to the libraries listed below:
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu…☆28Updated 3 weeks ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription☆77Updated 10 months ago
- Finetune VITS and MMS using HuggingFace's tools☆145Updated last year
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.☆242Updated 10 months ago
- ☆130Updated 4 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion☆174Updated 6 months ago
- ☆216Updated last month
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.☆302Updated last year
- Updates ASR papers every day☆196Updated this week
- Efficient approach to speaker diarization using voice characteristics extraction☆91Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆148Updated 11 months ago
- Create an LJSpeech-structured voice dataset from WAV input☆28Updated 6 months ago
- On-device streaming text-to-speech engine powered by deep learning☆76Updated this week
- Real-time Speech-Text Foundation Model Toolkit (wip)☆224Updated last month
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆180Updated this week
- Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)☆409Updated this week
- This project performs speaker diarization for the Hindi language.☆49Updated 4 years ago
- ☆26Updated 3 weeks ago
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation☆150Updated 2 months ago
- Speaker diarization model☆27Updated 2 years ago
- Text-to-Speech for languages of India☆234Updated 5 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass.☆75Updated this week
- ☆269Updated 10 months ago
- ☆39Updated last year
- Text to speech alignment using CTC forced alignment☆270Updated last month
- ☆356Updated 7 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)☆47Updated 9 months ago
- FastAPI service on top of WhisperX☆85Updated this week
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection☆680Updated 4 months ago