alphacep/awesome-speech

Resources that make every language unique

☆32

Alternatives and similar repositories for awesome-speech

Users that are interested in awesome-speech are comparing it to the libraries listed below.

xinjli / phonepiece
phone inventory library
☆17 · May 15, 2023 · Updated 3 years ago
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 4 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15 · Nov 30, 2022 · Updated 3 years ago
alphacep / vosk-space
Website and documentation
☆24 · May 4, 2026 · Updated 2 months ago
jisang93 / VISinger
Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC…
☆20 · May 12, 2023 · Updated 3 years ago
Etherll / Timbre
Extract a target speaker’s clean, non-overlapped speech from multi-speaker audio and export word-safe LJSpeech-style TTS datasets.
☆21 · Jun 14, 2026 · Updated last month
nvidia-riva / nemo2riva
NeMo -> Riva Conversion Tool
☆26 · Nov 17, 2025 · Updated 8 months ago
pengzhendong / ngram-punctuator
An N-gram punctuator for Chinese and English.
☆18 · Oct 14, 2025 · Updated 9 months ago
nmfisher / sherpa_onnx_dart
Dart plugin wrapping the Sherpa-ONNX runtime. Contains example for speech recognition with Flutter
☆22 · Jan 3, 2025 · Updated last year
alphacep / nativescript-vosk
Vosk Speech Recognition Plugin for Nativescript
☆21 · Oct 30, 2021 · Updated 4 years ago
nxoim / decomposite
ARCHIVED INDEFINITELY (for api design reasons). Compose Multiplatform router-style navigation library, based on Decompose, with custom an…
☆12 · Nov 4, 2024 · Updated last year
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year
langswap-app / langswap
Self-hosted AI video dubbing with ASR, translation, voice cloning, subtitles, and local GPU inference.
☆35 · Jun 22, 2026 · Updated last month
motazsaad / ara-pronunciation-tool
A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
alphacep / vosk-tts
Text To Speech Synthesis with Vosk
☆268 · Jun 6, 2026 · Updated last month
SpeechClub / CDER_Metric
CDER (Conversational Diarization Error Rate) Scoring Tool
☆22 · Sep 13, 2022 · Updated 3 years ago
leonjovanovic / keywords-extraction
Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT.
☆12 · May 22, 2023 · Updated 3 years ago
alphacep / sherpa-onnx
Real-time speech recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, R…
☆11 · Jan 29, 2026 · Updated 5 months ago
LAION-AI / scaled-echo-tts
Scaled diffusion transformer for text-to-speech synthesis (DiT + T5Gemma2 conditioning, TorchTitan & Megatron backends, tested up to 1024…
☆24 · Mar 29, 2026 · Updated 3 months ago
samueljamesbell / sequence-labeler
Neural network sequence labeling model
☆11 · Dec 28, 2019 · Updated 6 years ago
kyutai-labs / tts_longeval
☆30 · Apr 29, 2026 · Updated 2 months ago
bookbot-hive / k2-indonesian-asr
Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).
☆16 · Jun 30, 2023 · Updated 3 years ago
AIRI-Institute / AI4TALK
☆13 · Dec 7, 2022 · Updated 3 years ago
Takaaki-Saeki / zm-text-tts
[IJCAI'23] Learning to Speak from Text for Low-Resource TTS
☆65 · May 30, 2023 · Updated 3 years ago
WhissleAI / PromptingNemo
All-in-one Speech Transcription
☆11 · Jun 5, 2026 · Updated last month
alexisdmacintyre / SpeechBreathingToolbox
Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.
☆11 · Feb 17, 2024 · Updated 2 years ago
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
ZQuang2202 / Zipformer_Lightning
An upgrade framework for train and validate compare with icefall using Lightning.
☆16 · Mar 26, 2025 · Updated last year
AranKomat / adapinp
Unofficial implementation of Adaptive Input in PyTorch
☆12 · Feb 22, 2019 · Updated 7 years ago
TigreGotico / phoonnx
A Python library for multilingual phonemization and Text-to-Speech (TTS) using ONNX models.
☆27 · Updated this week
speechcatcher-asr / speechcatcher-data
☆11 · Sep 5, 2025 · Updated 10 months ago
ekapolc / gowajee_corpus
Thai smart home corpus with "Gowajee" hotword
☆19 · Jul 30, 2023 · Updated 2 years ago
alumae / voxlingua107_sb
VoxLingua107 recipe for SpeechBrain
☆13 · Jul 3, 2021 · Updated 5 years ago
kyutai-labs / sphn
python bindings for symphonia/opus - read various audio formats from python and write opus files
☆80 · Jan 7, 2026 · Updated 6 months ago
tabahi / contexless-phonemes-CUPE
pytorch model for contexless-phoneme prediction from speech audio
☆32 · Oct 30, 2025 · Updated 8 months ago
efeslab / LiteASR
[EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
☆154 · May 18, 2025 · Updated last year
IntendedConsequence / vadc
Uses the excellent silero VAD with onnxruntime C api for fast detection of audio segments with speech
☆16 · Sep 20, 2024 · Updated last year
alaershov / sample-mars-colony
Decompose BottomSheet Sample
☆14 · Feb 16, 2026 · Updated 5 months ago
Picovoice / falcon
On-device speaker diarization powered by deep learning
☆74 · Jul 2, 2026 · Updated 3 weeks ago