Audio Diarization Annotation tool
☆30Nov 8, 2019Updated 6 years ago
Alternatives and similar repositories for audio_diarization_annotation
Users who are interested in audio_diarization_annotation compare it to the libraries listed below
- Transfer learning approach to pronunciation scoring☆12Jan 17, 2024Updated 2 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly.☆16Jun 17, 2022Updated 3 years ago
- ☆21Sep 24, 2018Updated 7 years ago
- Wake word spotting with Kaldi☆19Dec 3, 2020Updated 5 years ago
- Testing sets for semanticVAD☆20Feb 18, 2025Updated last year
- Implementation of StyleTTS for Mandarin☆11Jun 22, 2023Updated 2 years ago
- An open-source tool for automatic speech recognition (ASR) quality estimation.☆23Dec 12, 2019Updated 6 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Mar 6, 2023Updated 3 years ago
- Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).☆15Jun 30, 2023Updated 2 years ago
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR.☆19Nov 23, 2021Updated 4 years ago
- An application that helps people with dysarthria improve their pronunciation using deep learning☆10Dec 29, 2020Updated 5 years ago
- An upgraded framework for training and validation, compared with icefall, using Lightning.☆15Mar 26, 2025Updated 11 months ago
- ☆33Nov 27, 2021Updated 4 years ago
- Perform the forced decoding with target transcription☆11Sep 12, 2018Updated 7 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data☆95Jul 6, 2023Updated 2 years ago
- 2nd place solution for ID R&D Voice Antispoofing Challenge☆15Aug 22, 2019Updated 6 years ago
- VocalVerse: A powerful vocal evaluation framework powered by the Qwen LLMs☆43Jan 22, 2026Updated 2 months ago
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.☆22Jul 12, 2019Updated 6 years ago
- ☆42Jun 25, 2018Updated 7 years ago
- Speech recognition module for Python, supporting several engines and APIs, online and offline.☆13Mar 9, 2022Updated 4 years ago
- Scripts to align a given waveform to its transcription using models trained with Kaldi☆36Aug 15, 2019Updated 6 years ago
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…☆11Jun 28, 2021Updated 4 years ago
- ☆11Jun 14, 2024Updated last year
- Keyword Search Recipe for Subword ASR☆30Jul 12, 2019Updated 6 years ago
- Java Bindings for the C++ library DeepSpeech☆10Jun 4, 2020Updated 5 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆13Feb 5, 2025Updated last year
- Sing any popular song with your voice☆11Jul 10, 2022Updated 3 years ago
- A toolkit for benchmarking on a wide variety of audio deepfake datasets.☆29Oct 9, 2025Updated 5 months ago
- ☆13Apr 14, 2024Updated last year
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens…☆50Feb 17, 2026Updated last month
- ☆13Oct 27, 2021Updated 4 years ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva☆91Feb 18, 2025Updated last year
- ☆25Jun 14, 2022Updated 3 years ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Adaptive Multimodal Reasoning via Reinforcement Learning☆23Jan 11, 2026Updated 2 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 11 months ago
- SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing☆41Apr 5, 2023Updated 2 years ago
- Using OpenVINO to speed up MeloTTS inference☆15Nov 1, 2024Updated last year