WiraDKP / pytorch_speaker_embedding_for_diarization
Using speaker embedding for diarization in PyTorch
☆18Updated 4 years ago
Alternatives and similar repositories for pytorch_speaker_embedding_for_diarization:
Users that are interested in pytorch_speaker_embedding_for_diarization are comparing it to the libraries listed below
- Transformer implementation specialized in speech recognition tasks using PyTorch.☆64Updated 3 years ago
- Clustering-based methods for overlapping diarization☆78Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆75Updated 3 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆48Updated 8 months ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 4 years ago
- A lightweight library to compute Diarization Error Rate (DER).☆59Updated last year
- PyTorch based speaker embedding model☆15Updated 11 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆50Updated 2 years ago
- A package for crawling audio from YouTube☆41Updated last year
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations☆37Updated last year
- A PyTorch implementation of End-to-End Neural Diarization☆104Updated last year
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 4 years ago
- [Interspeech22] Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Ass…☆27Updated last year
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆61Updated 4 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- A list of papers for child ASR☆38Updated 5 months ago
- PyTorch implementation of RNN-Transducer(RNN-T).☆75Updated 3 years ago
- This is the source code of the paper "Neural grapheme-to-phoneme conversion with pretrained grapheme models"☆46Updated 3 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆73Updated 4 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆82Updated last year
- Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…☆21Updated last year
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021☆57Updated 2 years ago
- pytorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper☆22Updated 2 years ago
- An implementation of RNN-Transducer loss in TF-2.0.☆45Updated 2 years ago
- Pytorch implementation of Generalized End-to-End Loss for speaker verification☆84Updated 5 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech☆51Updated 3 years ago