jonatasgrosman / wav2vec2-sprint
☆175Updated 2 years ago
Related projects
Alternatives and complementary repositories for wav2vec2-sprint
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools☆432Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆141Updated 6 months ago
- Wav2Vec for speech recognition, classification, and audio classification☆249Updated 2 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆321Updated 6 months ago
- ESPnet Model Zoo☆245Updated last year
- Variational Bayes HMM over x-vectors diarization☆254Updated 10 months ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…☆207Updated 2 years ago
- Various speech datasets made available to the public☆99Updated last month
- Diarization scoring tools.☆220Updated last year
- This is the GitHub page for publicly available emotional speech data.☆322Updated 2 years ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus☆165Updated 4 months ago
- CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus☆183Updated 2 years ago
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art results on the Speech Commands dataset V1 and V2.☆100Updated last year
- Multilingual G2P in 100 languages☆288Updated last year
- Speaker diarization for the Hindi language.☆45Updated 3 years ago
- Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…☆187Updated 3 weeks ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆58Updated 2 years ago
- NeMo text processing for ASR and TTS☆285Updated this week
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆183Updated 2 months ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"☆111Updated 2 years ago
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.☆201Updated 10 months ago
- Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised …☆128Updated 10 months ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆64Updated 3 years ago
- A simplified version of wav2vec (1.0, vq, 2.0) in fairseq☆132Updated 4 years ago
- Predicts the level of noise and reverberation in your audio files☆138Updated 5 months ago
- Spot the conversation: speaker diarisation in the wild☆123Updated 2 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆110Updated 2 years ago
- ☆63Updated last month
- Speech-to-text with self-supervised learning based on the wav2vec 2.0 framework, using Hugging Face's Transformers☆30Updated 3 years ago