WiraDKP / pytorch_speaker_embedding_for_diarization
Using speaker embedding for diarization in PyTorch
☆16Updated 4 years ago
Related projects:
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆71Updated 2 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 3 years ago
- Repository for reproducing the results of the journal article "Self-supervised learning for Speech Emotion Recognition"☆9Updated last year
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆47Updated last year
- Emotion detection in audio utilising self-supervised representations trained with Contrastive Predictive Coding (CPC).☆41Updated 2 years ago
- ☆37Updated 3 years ago
- Explore different ways to combine speech models (wav2vec2, HuBERT) with NLP models (BART, T5, GPT)☆42Updated last year
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech☆51Updated 2 years ago
- Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y…☆24Updated 5 years ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Updated 3 years ago
- Clustering-based methods for overlapping diarization☆68Updated 8 months ago
- ☆69Updated this week
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆57Updated 3 years ago
- Making Espnet easier to use☆51Updated 3 years ago
- ☆32Updated 3 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated last year
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021☆57Updated last year
- ☆38Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆39Updated 2 months ago
- Source code for the paper "Neural grapheme-to-phoneme conversion with pretrained grapheme models"☆44Updated 2 years ago
- Code for the paper "Speech Recognition and Multi-Speaker Diarization of Long Conversations"☆36Updated last year
- PyTorch based speaker embedding model☆15Updated 5 months ago
- Follows NVIDIA, simplified, with data-parallel support.☆13Updated 4 years ago
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021☆63Updated 2 years ago
- Code for AccentDB.☆20Updated 3 years ago
- Multilingual and code-switching ASR challenges for low resource Indian languages.☆20Updated 3 years ago
- A new metric for evaluating end-to-end speech recognition and disfluency removal systems☆19Updated 3 years ago
- An implementation of RNN-Transducer loss in TF-2.0.☆45Updated last year
- Prosodic Speech Segmentation with Transformers☆22Updated 6 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆97Updated last year