dcaulley/av_diarization

AudioVisual Diarization - Supervised and Unsupervised

☆15

Alternatives and similar repositories for av_diarization

Users who are interested in av_diarization are comparing it to the repositories listed below.

JaesungHuh / VoxSRC2021
Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2021
☆19 · Jul 21, 2021 · Updated 5 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
idiap / IBDiarization
C++ Implementation of the Information Bottleneck System
☆22 · Jan 9, 2019 · Updated 7 years ago
ttslr / MonTTS
☆16 · Dec 23, 2021 · Updated 4 years ago
albanie / LearningGrimacesByWatchingTV
Code to accompany the paper "Learning Grimaces By Watching TV" and FaceValue dataset
☆12 · Aug 4, 2018 · Updated 7 years ago
MiuLab / Lattice-Transformer-SLU
Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"
☆10 · Jul 8, 2020 · Updated 6 years ago
LilDevsy0117 / Ultra-Sortformer
Ultra-Sortformer for Scalable Speaker Diarization
☆27 · Apr 9, 2026 · Updated 3 months ago
18573462816 / MEBCRN
Deep learning network MEBCRN for separation of fat and water magnetic resonance images
☆11 · Dec 29, 2020 · Updated 5 years ago
fanlu / wenet
Transformer based ASR Engine.
☆13 · Aug 23, 2021 · Updated 4 years ago
csukuangfj / icefall
☆11 · Updated this week
dihardchallenge / dihard3_baseline
☆30 · Jul 21, 2022 · Updated 4 years ago
thuhcsi / Contextual-Biasing-Dataset
Open-source Mandarin biased word dataset
☆14 · Sep 21, 2023 · Updated 2 years ago
WangHelin1997 / SpecAugment-plus
A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification
☆34 · Jun 25, 2021 · Updated 5 years ago
pzelasko / kaldialign
Python wrappers for Kaldi's Levenshtein distance and alignment code.
☆70 · Jun 15, 2026 · Updated last month
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 · Jun 17, 2022 · Updated 4 years ago
nryant / dscore
Diarization scoring tools.
☆267 · Apr 8, 2026 · Updated 3 months ago
tango4j / llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
☆16 · Jun 16, 2024 · Updated 2 years ago
nii-yamagishilab / SpeechSPC-mini
Speech Security and Privacy Compendium - Mini
☆10 · Jun 18, 2024 · Updated 2 years ago
wavlab-speech / cmu_multilingual_speech
CMU multilingual speech repository
☆30 · Apr 15, 2022 · Updated 4 years ago
messiaen / full-lattice-search
Full Text Search Over Probabilistic Lattices with Elasticsearch!
☆10 · Nov 20, 2020 · Updated 5 years ago
MobiSciLab / Baresip-DemoAudioCall
This is a demo of using Baresip for audio calls
☆10 · Jun 5, 2017 · Updated 9 years ago
rishikksh20 / NU-Wave2-pytorch
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]
☆25 · Jul 5, 2022 · Updated 4 years ago
idiap / inv-tn
A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)
☆21 · Sep 27, 2017 · Updated 8 years ago
felixfuyihui / AISHELL-4
☆140 · Jul 21, 2021 · Updated 5 years ago
tts-tutorial / icassp2022
☆64 · May 23, 2022 · Updated 4 years ago
JoeHEZHAO / Spatiotemporal-Residual-Propagation
Code release for ICCV 2019 paper "Spatiotemporal Feature Residual Propagation for Action Prediction"
☆14 · Sep 20, 2021 · Updated 4 years ago
worldarena / WorldArena
The official repository of the WorldArena benchmark
☆15 · Mar 23, 2026 · Updated 3 months ago
usnistgov / F4DE
Framework for Detection Evaluation (F4DE): set of evaluation tools for detection evaluations and for specific NIST-coordinated evaluatio…
☆26 · Jul 6, 2017 · Updated 9 years ago
HeimingX / TAG
Official code for Attention-driven GUI Grounding, AAAI 2025
☆15 · Dec 17, 2024 · Updated last year
speechpro / mixup
☆24 · Mar 13, 2020 · Updated 6 years ago
wngh1187 / RawNeXt
PyTorch implementation of RawNeXt: Speaker verification system for variable-duration utterance with deep layer aggregation and dynamic sc…
☆25 · Jun 22, 2022 · Updated 4 years ago
staplesinLA / denoising_DIHARD18
☆60 · Sep 26, 2020 · Updated 5 years ago
ronggong / mispronunciation-detection
Mispronunciation detection code for jingju singing voice
☆19 · Sep 5, 2018 · Updated 7 years ago
kamperh / recipe_swbd_wordembeds
☆22 · Mar 22, 2017 · Updated 9 years ago
welcheb / FattyRiot
FattyRiot algorithm for separation of fat and water magnetic resonance images
☆14 · Nov 5, 2015 · Updated 10 years ago
edemattos / asr
Automatic Speech Recognition at the University of Edinburgh.
☆16 · Mar 14, 2021 · Updated 5 years ago
AdolfVonKleist / RnnLMG2P
Grapheme-to-Phoneme conversion with Joint-Sequence RnnLMs
☆30 · Dec 15, 2014 · Updated 11 years ago
thu-ml / LM-Calibration
☆17 · May 31, 2023 · Updated 3 years ago
cornerfarmer / ctc_segmentation
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75 · Oct 9, 2020 · Updated 5 years ago