jonatasgrosman / wav2vec2-sprint
☆187Updated 3 years ago
Alternatives and similar repositories for wav2vec2-sprint:
Users who are interested in wav2vec2-sprint are comparing it to the libraries listed below:
- HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools☆445Updated last year
- Various speech datasets made available to the public☆114Updated 3 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆146Updated 10 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆98Updated last month
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- ASRecognition: just an easy-to-use library for Automatic Speech Recognition.☆51Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization☆266Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆406Updated last month
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆75Updated 3 years ago
- Spot the conversation: speaker diarisation in the wild☆137Updated 2 years ago
- ☆66Updated 3 months ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus☆170Updated 3 months ago
- Some comprehensive papers about speaker diarization☆267Updated last month
- Diarization scoring tools.☆240Updated 2 years ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"☆117Updated 2 years ago
- Wav2Vec for speech recognition, classification, and audio classification☆261Updated 2 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆156Updated last year
- Large, modern dataset for speech recognition☆669Updated last year
- A lightweight library to compute Diarization Error Rate (DER).☆59Updated last year
- Voice Activity Detection (VAD) using deep learning.☆194Updated 5 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆82Updated last year
- UniSpeech - Large Scale Self-Supervised Learning for Speech☆453Updated 11 months ago
- Predicts the level of noise and reverberation in your audio files☆148Updated 10 months ago
- Speaker embedding (d-vector) trained with GE2E loss☆278Updated last year
- ESPnet Model Zoo☆247Updated last year
- [deprecated] Pretrained models for pyannote-audio 1.x☆72Updated 2 years ago
- ☆91Updated 2 years ago
- ☆39Updated last year
- A collection of datasets for the purpose of emotion recognition/detection in speech.☆316Updated 5 months ago
- Multilingual G2P in 100 languages☆311Updated last year