Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).
☆15Jun 30, 2023Updated 2 years ago
Alternatives and similar repositories for k2-indonesian-asr
Users that are interested in k2-indonesian-asr are comparing it to the libraries listed below.
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- ☆21Jul 22, 2022Updated 3 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- An upgraded framework for training and validation, compared with icefall, using Lightning.☆16Mar 26, 2025Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated last year
- ☆11Nov 5, 2021Updated 4 years ago
- A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …☆16Sep 5, 2017Updated 8 years ago
- Adaptive Multimodal Reasoning via Reinforcement Learning☆23Jan 11, 2026Updated 4 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 4 years ago
- A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX.☆13Oct 22, 2024Updated last year
- phone inventory library☆17May 15, 2023Updated 3 years ago
- ☆13Dec 7, 2022Updated 3 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 3 years ago
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind☆65Sep 22, 2025Updated 8 months ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆36Aug 30, 2025Updated 9 months ago
- Thai smart home corpus with "Gowajee" hotword☆19Jul 30, 2023Updated 2 years ago
- Audio Diarization Annotation tool☆30Nov 8, 2019Updated 6 years ago
- ☆10Sep 18, 2017Updated 8 years ago
- ☆26Mar 20, 2024Updated 2 years ago
- Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC…☆20May 12, 2023Updated 3 years ago
- [AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model☆79Apr 7, 2026Updated last month
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated 2 years ago
- Crowdsourced and Automatic Speech Prominence Estimation☆26Apr 12, 2024Updated 2 years ago
- Resources that make every language unique☆28May 8, 2026Updated 3 weeks ago
- Train a fiwGAN or ciwGAN model using your own training data☆14Oct 13, 2022Updated 3 years ago
- ☆23Oct 17, 2024Updated last year
- ☆25Jan 14, 2021Updated 5 years ago
- From the paper Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition☆29Nov 20, 2024Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆109May 20, 2025Updated last year
- Reimplementation of Miipher☆30Aug 16, 2023Updated 2 years ago
- A list of papers for child ASR☆54Oct 8, 2024Updated last year
- Extract phoneme-level timestamps from speech audio.☆134May 12, 2026Updated 2 weeks ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆23Dec 5, 2022Updated 3 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.☆33Apr 22, 2026Updated last month
- Raspberry Pi qwen-omni voice assistant, no TTS/STT required☆18Apr 4, 2025Updated last year
- Dictionary of pairs of Korean word and IPA crawled from Wiktionary (Korean edition)☆23Nov 12, 2025Updated 6 months ago