naver/multilingual-distilwhisper

This repository contains all the code necessary to run the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper.

☆34

Alternatives and similar repositories for multilingual-distilwhisper

Users who are interested in multilingual-distilwhisper are comparing it to the repositories listed below.

cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
arnabdas8901 / StarGAN-VC_PlusPlus
☆11 · Aug 11, 2023 · Updated 2 years ago
iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
tuanct1997 / Federated-Learning-ASR-based-on-wav2vec-2.0
☆18 · Mar 13, 2024 · Updated 2 years ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 · Jul 23, 2024 · Updated 2 years ago
Koziev / StressModel
Neural model for prediction of stress position in Russian words
☆13 · Jun 22, 2025 · Updated last year
TTS-Research / PEL-TTS
☆14 · Aug 16, 2023 · Updated 2 years ago
wentaozhu / speechnas
SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification
☆30 · Mar 24, 2023 · Updated 3 years ago
Audio-WestlakeU / Mel-McNet
The Official PyTorch Implementation of "Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement" [Interspeech 2025]
☆26 · May 14, 2026 · Updated 2 months ago
efeslab / LiteASR
[EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
☆154 · May 18, 2025 · Updated last year
xinjli / asr2k
asr2k
☆51 · Jun 2, 2024 · Updated 2 years ago
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
skit-ai / Map-Mix
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix(work accepted at I…
☆18 · Feb 17, 2023 · Updated 3 years ago
hitz-zentroa / whisper-lm
Add n-gram and large language model (LLM) support to Whisper models.
☆43 · May 6, 2025 · Updated last year
AIRI-Institute / AI4TALK
☆13 · Dec 7, 2022 · Updated 3 years ago
lingjzhu / zipa
A family of efficient speech models for multilingual phone recognition
☆68 · Jul 18, 2026 · Updated last week
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 4 years ago
burrmill / burrmill
BurrMill core
☆22 · Nov 2, 2021 · Updated 4 years ago
microsoft / NoAudioCaptioning
Repository for "Training Audio Captioning Models without Audio"
☆10 · Sep 26, 2023 · Updated 2 years ago
declare-lab / HyperTTS
☆40 · Apr 15, 2024 · Updated 2 years ago
AI4Bharat / IndicVoices-R
A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
☆64 · Dec 11, 2024 · Updated last year
SSTC-Challenge / SSTC2024_baseline_system
☆12 · Jun 14, 2024 · Updated 2 years ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
zds-potato / multilingual-phonetic-sv
☆10 · Dec 22, 2023 · Updated 2 years ago
ShoukanLabs / VoPho
A collection of all our phonemeizers for dataset construction and inference
☆30 · Feb 21, 2025 · Updated last year
xinjli / transphone
phoneme tokenizer and grapheme-to-phoneme model for 8k languages
☆174 · Jun 9, 2023 · Updated 3 years ago
WelkinYang / EMPHASIS-pytorch
EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System
☆15 · Mar 31, 2019 · Updated 7 years ago
google-research-datasets / WikipediaHomographData
Labeled data for homograph disambiguation
☆62 · Jun 1, 2023 · Updated 3 years ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
NikolaiKyhne / RWSAMamba-UNet
Official repository for the paper "Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enh…
☆19 · May 5, 2026 · Updated 2 months ago
kdrkdrkdr / JK-VITS
Bilingual-TTS (Japanese and Korean)
☆32 · Jul 1, 2023 · Updated 3 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15 · Nov 30, 2022 · Updated 3 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 · Dec 12, 2024 · Updated last year
ictnlp / NAST-S2x
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆78 · Oct 22, 2024 · Updated last year
MaxMax2016 / Glow-SVC
4G GPU & 10 Minutes for train
☆12 · Aug 9, 2023 · Updated 2 years ago
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19 · Jun 5, 2023 · Updated 3 years ago
vTAD2025-Challenge / vTAD
☆17 · Oct 24, 2025 · Updated 9 months ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
AkshathRaghav / tinyspeech
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
☆23 · Jun 7, 2025 · Updated last year