etri/kmsav

☆14

Alternatives and similar repositories for kmsav

Users who are interested in kmsav are comparing it to the repositories listed below.

fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
kh-kim / study_organizer
Collection of Lectures, Articles, Slides and Papers for Deep Learning (+Machine Learning)
☆11 · Jan 12, 2017 · Updated 9 years ago
hyunokoh / SimpleMIPS
☆15 · Mar 9, 2026 · Updated 4 months ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official PyTorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 · Jul 23, 2024 · Updated 2 years ago
jagabandhumishra / W2V-E2E-Language-Diarization
☆11 · Sep 4, 2023 · Updated 2 years ago
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22 · Jun 6, 2025 · Updated last year
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
BUTSpeechFIT / diacorrect
Error correction back-end for speaker diarization
☆18 · Sep 26, 2023 · Updated 2 years ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year
arranger1044 / DEBD
A collection of commonly used datasets as benchmarks for density estimation in MaLe
☆20 · Jul 15, 2019 · Updated 7 years ago
Mu-Y / DiariST
☆18 · Sep 19, 2023 · Updated 2 years ago
MoonJuhan / tistory-readme-stats
Tistory Readme Stat Card
☆11 · Mar 27, 2024 · Updated 2 years ago
chimechallenge / chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
☆26 · Feb 25, 2025 · Updated last year
sungnyun / cav2vec
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
☆16 · Apr 29, 2025 · Updated last year
Helw150 / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX
☆16 · Jun 16, 2024 · Updated 2 years ago
umbertocappellazzo / Omni-AVSR
Official PyTorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…
☆38 · Mar 10, 2026 · Updated 4 months ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
nervjack2 / Speech2Unit
☆13 · Sep 25, 2024 · Updated last year
KevRiver / MSB
Mad Square's Brawl is a 2D Android platformer PvP game.
☆17 · Feb 15, 2023 · Updated 3 years ago
xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆20 · Mar 17, 2025 · Updated last year
introlab / uimvdr
☆13 · Oct 11, 2024 · Updated last year
mushanshanshan / ESLTTS
ESLTTS dataset
☆16 · Feb 6, 2025 · Updated last year
HaoFengyuan / EEND-IAAE
The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…
☆11 · Aug 27, 2023 · Updated 2 years ago
BUTSpeechFIT / DVBx
Discriminative Training of VBx Diarization
☆28 · Sep 23, 2024 · Updated last year
cjchun3616 / zero_shot_gradtts
zero_shot_gradtts
☆14 · Oct 23, 2023 · Updated 2 years ago
JeongHun0716 / MMS-LLaMA
Official PyTorch implementation for "MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens…
☆48 · Jun 12, 2025 · Updated last year
utter-project / mHuBERT-147-scripts
Collection of scripts from mHuBERT-147.
☆35 · Nov 19, 2024 · Updated last year
IIP-Sogang / olkavs-avspeech
The Introduction of the OLKAVS Dataset
☆39 · May 28, 2024 · Updated 2 years ago
kurianbenoy / whisper_normalizer
A Python package for whisper normalizer
☆79 · Updated this week
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
MontrealCorpusTools / kalpy
Pybind11 bindings for Kaldi
☆15 · Jul 11, 2026 · Updated last week
alumae / torch-xvectors-wav
☆22 · Jun 30, 2021 · Updated 5 years ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
LAION-AI / Vocalino-V0.1-Voice-Acting-Pipeline
Open-weights voice acting pipeline combining zero-shot voice cloning with natural-language direction. Provide a reference voice (or gener…
☆17 · May 25, 2026 · Updated last month
Bartelds / ctc-dro
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17 · May 16, 2025 · Updated last year
line / WaveTrainerFit
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16 · Feb 6, 2026 · Updated 5 months ago
choijeongsoo / utut
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆31 · Sep 6, 2024 · Updated last year