talhanai/kaldi-diar-latte


Steps to perform text-based speaker diarization with the Kaldi toolkit

☆12

Alternatives and similar repositories for kaldi-diar-latte

Users that are interested in kaldi-diar-latte are comparing it to the libraries listed below.

tiro-is / tiro-speech-core
This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core
☆15 • Jun 19, 2023 • Updated 3 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 • Mar 30, 2021 • Updated 5 years ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 • Updated this week
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 • Nov 5, 2021 • Updated 4 years ago
burrmill / burrmill
BurrMill core
☆22 • Nov 2, 2021 • Updated 4 years ago
aalto-speech / subword-kaldi
Properly handle position-dependent phones in a subword lexicon FST
☆31 • Oct 26, 2020 • Updated 5 years ago
alumae / streaming-punctuator
☆17 • Apr 14, 2023 • Updated 3 years ago
m-wiesner / nnet_pytorch
Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.
☆26 • Jul 25, 2024 • Updated 2 years ago
speechio / asr-noises
A handy dataset of noises for ASR
☆22 • May 29, 2019 • Updated 7 years ago
NickRuiz / power-asr
Phonetically-Oriented Word Error Rate
☆36 • May 4, 2019 • Updated 7 years ago
revdotcom / words2num
Convert words to numbers
☆21 • Apr 13, 2022 • Updated 4 years ago
JazminVidal / gop-ft
Transfer learning approach to pronunciation scoring
☆12 • Jan 17, 2024 • Updated 2 years ago
dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 • Jul 16, 2023 • Updated 3 years ago
jtkim-kaist / end-point-detection
☆10 • Sep 19, 2018 • Updated 7 years ago
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 • Jun 17, 2022 • Updated 4 years ago
Open-Speech-EkStep / crowdsource-dataplatform
This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…
☆17 • Mar 6, 2023 • Updated 3 years ago
janson9192 / autokws2021
☆13 • Mar 25, 2021 • Updated 5 years ago
RicherMans / UIT_Mobile
Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"
☆24 • Mar 6, 2023 • Updated 3 years ago
Adibian / Persian-MultiSpeaker-Tacotron2
Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) in Persian language.
☆13 • Oct 2, 2025 • Updated 9 months ago
JazminVidal / gop-pykaldi
Goodness of Pronunciation algorithm using PyKaldi
☆19 • Jun 12, 2022 • Updated 4 years ago
asappresearch / multistream-cnn
Multistream CNN for Robust Acoustic Modeling
☆40 • Jun 17, 2021 • Updated 5 years ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 • Feb 13, 2021 • Updated 5 years ago
shiguredo / dtln-aec
An echo cancellation library for browsers using DTLN-aec
☆26 • Oct 18, 2023 • Updated 2 years ago
Idlak / Living-Audio-Dataset
A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone …
☆43 • Aug 3, 2022 • Updated 3 years ago
cadia-lvl / kaldi-speaker-diarization
This repository creates speaker diarization recipes to be used within the egs folder of kaldi.
☆17 • Aug 12, 2024 • Updated last year
danijel3 / ClarinStudioKaldi
A baseline Automatic Speech Recognition system for Polish based on Kaldi.
☆18 • Dec 21, 2021 • Updated 4 years ago
alumae / kiirkirjutaja
☆58 • Jul 3, 2026 • Updated 3 weeks ago
idiap / inv-tn
A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)
☆21 • Sep 27, 2017 • Updated 8 years ago
Speech-Lab-IITM / CCC-wav2vec-2.0
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆23 • Mar 18, 2024 • Updated 2 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 • Oct 2, 2024 • Updated last year
fgnt / LatticeWordSegmentation
Software to apply unsupervised word segmentation on lattices or text sequences using a nested hierarchical Pitman Yor language model
☆17 • Nov 24, 2016 • Updated 9 years ago
awasthiabhijeet / Error-Driven-ASR-Personalization
Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021
☆11 • Jun 13, 2021 • Updated 5 years ago
zhu-han / SpeechLLM
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35 • Sep 25, 2025 • Updated 10 months ago
projecte-aina / oTranscribe-plus
A free & open tool for transcribing audio interviews with offline ASR support
☆25 • Dec 21, 2023 • Updated 2 years ago
i3thuan5 / FaNT
Filtering and Noise Adding Tool
☆29 • May 27, 2022 • Updated 4 years ago
qqueing / pytorch-G2P
(semi) Grapheme-to-Phoneme (G2P) - seq2seq model using PyTorch for Korean
☆23 • Dec 17, 2017 • Updated 8 years ago
charlesliucn / LanMIT
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22 • Jul 12, 2019 • Updated 7 years ago
gullabi / STT-align
Coqui STT (🐸STT) based forced alignment tool
☆13 • Feb 24, 2022 • Updated 4 years ago
mikex86 / DeepSpeech-Java-Bindings
Java Bindings for the C++ library DeepSpeech
☆10 • Jun 4, 2020 • Updated 6 years ago