pr0mila / MediBeng-Whisper-Tiny

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, which simplifies analysis, record-keeping, and the use of AI in healthcare.

☆18

Alternatives and similar repositories for MediBeng-Whisper-Tiny

Users who are interested in MediBeng-Whisper-Tiny compare it to the repositories listed below.

clement-pages / gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
☆62 Updated last week
EndlessReform / smoltts
Open TTS models, built for streaming on the edge
☆43 Updated 2 months ago
thevoicecompany / gazelle-train
Joint speech-language model - respond directly to audio!
☆30 Updated last year
hitz-zentroa / whisper-lm
Add n-gram and large language model support to Whisper models.
☆19 Updated last month
hlt-mt / mosel
Collection of Open Source Speech Data
☆158 Updated 6 months ago
fakerybakery / simpletts
A lightweight Python library for running TTS models with a unified API.
☆18 Updated 3 months ago
indri-voice / audiotoken
Audio tokenization, in the fastest way possible!
☆52 Updated 9 months ago
utter-project / fairseq
This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.
☆17 Updated 6 months ago
AI4Bharat / IndicVoices-R
A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
☆40 Updated 5 months ago
knoriy / CLARA
☆62 Updated 10 months ago
egorsmkv / asr-corpus-creator
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
☆27 Updated last year
NeuralVox / OpenPhonemizer
An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…
☆98 Updated 7 months ago
rendchevi / daisy-tts
🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
☆15 Updated last year
linto-ai / linto-diarization
Speaker diarization service
☆23 Updated last month
andrewsilva9 / tune_tortoise_autoregressor
Fine tuning the UnifiedVoice autoregressor for TortoiseTTS.
☆15 Updated last year
ictnlp / LLaMA-Omni2
☆174 Updated 2 weeks ago
mesolitica / vllm-whisper
A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper
☆27 Updated 10 months ago
kyutai-labs / dactory
☆37 Updated last month
huggingface / diarizers
☆294 Updated 11 months ago
kyutai-labs / moshi-finetune
☆227 Updated 2 months ago
taresh18 / TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
☆40 Updated 2 weeks ago
sanchit-gandhi / whisper-flash-attention
☆20 Updated 2 years ago
huggingface / open_asr_leaderboard
☆103 Updated last week
shivammehta25 / OverFlow
Putting flows on top of neural transducers for better TTS
☆62 Updated last week
IIEleven11 / Automatic-Audio-Dataset-Maker
Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model.
☆36 Updated last week
yuriak / SpeechDialogueFactory
☆32 Updated 2 months ago
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
☆83 Updated last year
xincanfeng / vitsGPT
☆57 Updated 11 months ago
ryota-komatsu / speaker_disentangled_hubert
Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"
☆38 Updated this week
LAION-AI / Text-to-speech
☆60 Updated last year