lumaku / ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
☆319Updated 4 months ago
Related projects:
- Diarization scoring tools.☆213Updated last year
- Variational Bayes HMM over x-vectors diarization☆251Updated 8 months ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition.☆421Updated last year
- Multilingual G2P in 100 languages☆274Updated last year
- Onnx wrapper for espnet inference model☆152Updated 2 months ago
- A library for speech data augmentation in time-domain☆635Updated 3 years ago
- Grapheme to phoneme conversion with deep learning.☆346Updated 9 months ago
- UniSpeech - Large Scale Self-Supervised Learning for Speech☆419Updated 5 months ago
- End-to-End Neural Diarization☆367Updated 3 years ago
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text☆229Updated 4 years ago
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf☆359Updated 2 years ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆345Updated this week
- Spot the conversation: speaker diarisation in the wild☆119Updated 2 years ago
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation☆508Updated last year
- Charsiu: A neural phonetic aligner.☆267Updated 2 years ago
- Speaker embedding (d-vector) trained with GE2E loss☆270Updated 8 months ago
- Large, modern dataset for speech recognition☆629Updated 6 months ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus☆160Updated 2 months ago
- ESPnet Model Zoo☆242Updated last year
- Towards hot directions in industrial end to end speech recognition☆324Updated 2 years ago
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…☆375Updated last year
- A pure python module for reading and writing kaldi ark files☆248Updated last year
- Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset☆147Updated 5 years ago
- DeepSpeech based forced alignment tool☆232Updated 3 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆126Updated 3 months ago
- A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems☆184Updated last year
- Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised …☆127Updated 7 months ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆138Updated last year
- [NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition☆245Updated last year
- Predicts the level of noise and reverberation on your audiofiles☆134Updated 3 months ago