The-Data-Dilemma/MediBeng-Whisper-Tiny

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, which simplifies analysis, record-keeping, and the use of AI in healthcare.

☆29

Alternatives and similar repositories for MediBeng-Whisper-Tiny

Users that are interested in MediBeng-Whisper-Tiny are comparing it to the libraries listed below.

nethermanpro / ComSL
☆11 · Oct 14, 2023 · Updated 2 years ago
YLQY / WhisperMultitaskFinetuning
Multitask fine-tuning of the Whisper large speech model
☆16 · Oct 3, 2024 · Updated last year
ictnlp / ComSpeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
☆27 · Jul 2, 2024 · Updated 2 years ago
ictnlp / CMOT
Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"
☆17 · Oct 29, 2024 · Updated last year
davila7 / fintual_mcp_server
Fintual MCP Server
☆22 · May 23, 2025 · Updated last year
bootphon / learnable-strf
Learnable STRF, from Riad et al. 2021 JASA
☆13 · Aug 21, 2021 · Updated 4 years ago
NUS-HPC-AI-Lab / MoST
MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts
☆33 · Jan 15, 2026 · Updated 6 months ago
cyhuang-tw / robust-vc
☆11 · May 7, 2022 · Updated 4 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
ORI-Muchim / Efficient-Speech
Lightweight Korean TTS Model based on FastSpeech2
☆15 · Mar 4, 2026 · Updated 4 months ago
MegEngine / End-to-end-ASR-Transformer
An end-to-end ASR Transformer model training repo
☆13 · Dec 8, 2021 · Updated 4 years ago
viswavi / languageid
Identifying the language of input text using character-level n-grams, with support for 45 languages
☆11 · Dec 26, 2022 · Updated 3 years ago
dqqcasia / mosst
☆27 · Aug 31, 2022 · Updated 3 years ago
iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
thuml / DARE
Code release for "Long-Sequence Recommendation Models Need Decoupled Embeddings" (ICLR 2025), https://arxiv.org/abs/2410.02604
☆26 · Mar 5, 2025 · Updated last year
lucky-bai / wasm-speech-streaming
Offline streaming speech-to-text in the browser
☆25 · Aug 28, 2025 · Updated 10 months ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 · Oct 2, 2024 · Updated last year
r3eckon / AutoTilesetGenerator
AutoTile tileset generator for Unity
☆10 · Jul 5, 2019 · Updated 7 years ago
ttslr / MonTTS
☆16 · Dec 23, 2021 · Updated 4 years ago
RustedBytes / extract-audio
Extract audio files from a parquet or arrow file generated by Hugging Face `datasets` library.
☆17 · Jun 21, 2026 · Updated 3 weeks ago
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 · Jun 17, 2022 · Updated 4 years ago
daanzu / py-silero-vad-lite
Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies
☆17 · Nov 25, 2024 · Updated last year
RustedBytes / rf-detr-usls
RF-DETR + USLS: object detection using Rust
☆15 · Apr 12, 2025 · Updated last year
ictnlp / CRESS
Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".
☆16 · Oct 25, 2023 · Updated 2 years ago
2404589803 / hf-daily-paper-newsletter-multilingual
🤖 A multilingual translation tool that automatically converts Hugging Face's daily AI research papers into 🇯🇵 Japanese, 🇰🇷 Korean, …
☆18 · Updated this week
AidenAI-IO / aiden-firmware
AI Agent hardware for mobile phone
☆18 · Updated this week
Glaciohound / Chimera-ST
A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021
☆47 · Feb 21, 2022 · Updated 4 years ago
kaistmm / AdaptVC
☆17 · Jun 2, 2025 · Updated last year
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19 · Jun 5, 2023 · Updated 3 years ago
zszheng147 / VoiceCraft-X
☆40 · Nov 18, 2025 · Updated 8 months ago
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 3 years ago
walker-hyf / ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59 · Jun 20, 2024 · Updated 2 years ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
dqqcasia / st
End-to-end Speech Translation
☆35 · Apr 12, 2021 · Updated 5 years ago
thu-spmi / CTC-TTS
Code for CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment, Interspeech 2026.
☆20 · Jun 9, 2026 · Updated last month
ysharma3501 / LayaCodec
High fidelity neural audio codec for TTS models
☆36 · Dec 22, 2025 · Updated 6 months ago
SparkBarcelona / libro
Code repositories for the book "Introducción a Apache Spark para empezar a programar el Big Data"
☆14 · Nov 22, 2015 · Updated 10 years ago
200sc / go-compgeo
A Computational Geometry library in Go
☆12 · Feb 7, 2018 · Updated 8 years ago
ictnlp / DiSeg
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
☆37 · Dec 6, 2023 · Updated 2 years ago