lumaku/ctc-segmentation

Segment an audio file and obtain utterance alignments. (Python package)

☆348

Alternatives and similar repositories for ctc-segmentation

Users interested in ctc-segmentation are comparing it to the libraries listed below.

cornerfarmer / ctc_segmentation
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75 · Oct 9, 2020 · Updated 5 years ago
lingjzhu / charsiu
Charsiu: A neural phonetic aligner.
☆347 · Sep 19, 2022 · Updated 3 years ago
kensho-technologies / pyctcdecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
☆469 · Jul 13, 2023 · Updated 3 years ago
farisalasmary / wav2vec2-kenlm
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74 · Oct 11, 2021 · Updated 4 years ago
SpeechColab / GigaSpeech
Large, modern dataset for speech recognition
☆731 · Feb 26, 2024 · Updated 2 years ago
k2-fsa / fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149 · Aug 25, 2023 · Updated 2 years ago
facebookresearch / WavAugment
A library for speech data augmentation in time-domain
☆689 · Aug 30, 2021 · Updated 4 years ago
hirofumi0810 / neural_sp
End-to-end ASR/LM implementation with PyTorch
☆594 · Aug 30, 2021 · Updated 4 years ago
cywang97 / StreamingTransformer
☆277 · Jan 15, 2021 · Updated 5 years ago
burchim / EfficientConformer
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
☆221 · Jun 22, 2023 · Updated 3 years ago
neosapience / editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)
☆122 · Jan 24, 2023 · Updated 3 years ago
xinjli / alqalign
multilingual speech aligner
☆78 · Nov 19, 2023 · Updated 2 years ago
CUNY-CL / wikipron
Massively multilingual pronunciation mining
☆371 · Updated this week
mozilla / DSAlign
DeepSpeech based forced alignment tool
☆239 · Dec 12, 2020 · Updated 5 years ago
tencent-ailab / pika
a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi
☆354 · Dec 25, 2020 · Updated 5 years ago
sarulab-speech / jtubespeech
☆233 · Nov 13, 2023 · Updated 2 years ago
s3prl / s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
☆2,557 · Mar 12, 2026 · Updated 4 months ago
k2-fsa / k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
☆1,348 · Jul 11, 2026 · Updated 2 weeks ago
MontrealCorpusTools / Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
☆1,852 · Jul 11, 2026 · Updated 2 weeks ago
YiwenShaoStephen / pychain
PyTorch implementation of LF-MMI for End-to-end ASR
☆221 · Jan 14, 2021 · Updated 5 years ago
axelspringer / DeepPhonemizer
Grapheme to phoneme conversion with deep learning.
☆432 · Dec 8, 2023 · Updated 2 years ago
facebookresearch / vocoder-benchmark
A repository for benchmarking neural vocoders by their quality and speed.
☆213 · May 30, 2025 · Updated last year
Kyubyong / g2p
g2p: English Grapheme To Phoneme Conversion
☆926 · Jan 5, 2023 · Updated 3 years ago
nvidia-riva / riva-asrlib-decoder
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
☆91 · Feb 18, 2025 · Updated last year
lingjzhu / CharsiuG2P
Multilingual G2P in 100 languages
☆390 · May 26, 2023 · Updated 3 years ago
lhotse-speech / lhotse
Tools for handling multimodal data in machine learning projects.
☆1,143 · Jun 22, 2026 · Updated last month
csukuangfj / kaldifeat
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…
☆215 · Jul 10, 2026 · Updated 2 weeks ago
sp-nitech / diffsptk
A differentiable version of SPTK
☆201 · Jul 14, 2026 · Updated last week
jonatasgrosman / huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
☆470 · Sep 20, 2023 · Updated 2 years ago
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
coqui-ai / open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
☆1,397 · Jun 6, 2024 · Updated 2 years ago
facebookresearch / voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
☆574 · Apr 2, 2023 · Updated 3 years ago
bootphon / phonemizer
Simple text to phones converter for multiple languages
☆1,558 · Sep 26, 2024 · Updated last year
iver56 / torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
☆1,161 · Nov 24, 2025 · Updated 8 months ago
asappresearch / wav2seq
Official code for Wav2Seq
☆97 · Jul 19, 2022 · Updated 4 years ago
espnet / espnet_onnx
ONNX wrapper for espnet inference model
☆169 · Aug 11, 2025 · Updated 11 months ago
thu-spmi / CAT
CAT is more than a CRF-based ASR toolkit: it provides a complete workflow for data-efficient end-to-end ASR, supporting CTC, CTC-CRF, RNN…
☆368 · Feb 5, 2026 · Updated 5 months ago
BridgetteSong / ExpressiveTacotron
This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…
☆74 · Sep 21, 2022 · Updated 3 years ago
maum-ai / univnet
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
☆286 · Oct 8, 2021 · Updated 4 years ago