kamperh/speech_dtw

Dynamic time warping (DTW) functions specifically for speech alignment.

☆30

Alternatives and similar repositories for speech_dtw

Users who are interested in speech_dtw are comparing it to the libraries listed below.

dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 · Jul 16, 2023 · Updated 3 years ago
kamperh / speech_correspondence
Correspondence and autoencoder neural network training for speech using Pylearn2.
☆14 · Dec 9, 2015 · Updated 10 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 · Oct 2, 2024 · Updated last year
NickRuiz / power-asr
Phonetically-Oriented Word Error Rate
☆36 · May 4, 2019 · Updated 7 years ago
awasthiabhijeet / Error-Driven-ASR-Personalization
Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021
☆11 · Jun 13, 2021 · Updated 5 years ago
talhanai / kaldi-diar-latte
Steps to perform text-based speaker diarization with the Kaldi toolkit
☆12 · Nov 2, 2018 · Updated 7 years ago
abuccts / wikt2pron
A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format
☆34 · Jul 5, 2019 · Updated 7 years ago
qiujiali / lattice_rnn
Bi-directional Lattice Recurrent Neural Networks for Confidence Estimation
☆15 · Aug 28, 2020 · Updated 5 years ago
stefanocoretta / speakr
speakr: A Wrapper for the Phonetic Software Praat
☆27 · Feb 28, 2026 · Updated 4 months ago
motazsaad / ara-pronunciation-tool
A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
kamperh / globalphone_awe
Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.
☆11 · Nov 3, 2020 · Updated 5 years ago
JazminVidal / gop-ft
Transfer learning approach to pronunciation scoring
☆12 · Jan 17, 2024 · Updated 2 years ago
HaskinsLabs / get_vot
☆11 · May 14, 2017 · Updated 9 years ago
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 · Jun 17, 2022 · Updated 4 years ago
jayneelparekh / sp2si-code
Contains code for our work on speech to singing conversion (ICASSP 2020)
☆50 · Oct 27, 2020 · Updated 5 years ago
aguai / pyin
pYIN (Probabilistic YIN) is a modification of the well-loved YIN algorithm for fundamental frequency (F0) estimation in monophonic audio.…
☆33 · Aug 27, 2023 · Updated 2 years ago
athena-team / athena-transform
☆21 · Jan 13, 2020 · Updated 6 years ago
google-research / last
A JAX library for building lattice-based speech transducer models
☆48 · Jul 2, 2026 · Updated 3 weeks ago
JazminVidal / gop-pykaldi
Goodness of Pronunciation algorithm using PyKaldi
☆18 · Jun 12, 2022 · Updated 4 years ago
sigmorphon / 2020
SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…
☆36 · Apr 25, 2025 · Updated last year
frank613 / CTC-based-GOP
This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024
☆41 · Feb 5, 2026 · Updated 5 months ago
yuhaozhang / nnjm-global
A python implementation of the neural network joint language model and an extension of it using global source context.
☆11 · May 17, 2017 · Updated 9 years ago
srinivr / kaldi-long-audio-alignment
Long audio alignment using Kaldi
☆23 · Apr 22, 2021 · Updated 5 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
dogancan / expected-edit-distance
Expected edit distance implementation using OpenFst tools
☆11 · May 13, 2015 · Updated 11 years ago
Kaljurand / EKISpeak
Implementation of Android's TextToSpeechService that provides Estonian text-to-speech
☆17 · Jan 19, 2019 · Updated 7 years ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 · Nov 5, 2021 · Updated 4 years ago
Refefer / word2vec-scala
Scala port of the word2vec toolkit.
☆11 · Aug 15, 2016 · Updated 9 years ago
danijel3 / ClarinStudioKaldi
A baseline Automatic Speech Recognition system for Polish based on Kaldi.
☆18 · Dec 21, 2021 · Updated 4 years ago
Akshat4112 / voicenet
Comprehensive Python library for speech and voice.
☆32 · Dec 8, 2022 · Updated 3 years ago
soupdtag / speak-tool
A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github…
☆16 · Dec 19, 2022 · Updated 3 years ago
charlesliucn / LanMIT
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22 · Jul 12, 2019 · Updated 7 years ago
BornInWater / Overlap-Detection
Overlapped Speech detection in Multi-party Conversations
☆22 · Feb 20, 2018 · Updated 8 years ago
misskaseyann / acoustic-event-detection
Acoustic event detection using recurrent neural networks.
☆11 · Sep 4, 2018 · Updated 7 years ago
speechio / asr-noises
A handy dataset of noises for ASR
☆22 · May 29, 2019 · Updated 7 years ago
ehsanasgari / 1000Langs
Creating super-parallel corpora of more than 1500 unique languages for NLP research
☆33 · Dec 8, 2022 · Updated 3 years ago
Chung-I / youtube-asr-crawler
☆10 · Sep 19, 2022 · Updated 3 years ago
projecte-aina / oTranscribe-plus
A free & open tool for transcribing audio interviews with offline ASR support
☆25 · Dec 21, 2023 · Updated 2 years ago