CouncilDataProject / speakerbox
Speakerbox: Fine-tune Audio Transformers for speaker identification.
☆52 Updated 8 months ago
Related projects
Alternatives and complementary repositories for speakerbox
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆44 Updated 4 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆71 Updated last year
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 9 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Clustering-based methods for overlapping diarization ☆70 Updated 10 months ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆19 Updated 2 months ago
- 56 languages, 1 model: multilingual ASR ☆24 Updated 3 years ago
- ☆49 Updated 9 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆47 Updated last year
- ☆16 Updated 3 years ago
- ☆23 Updated last year
- A toolkit to calculate speech audio quality. Not affiliated with the original authors. ☆39 Updated 3 months ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆71 Updated 3 years ago
- ☆56 Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 Updated 2 years ago
- Final training script from the HuggingFace Whisper fine-tuning event, for getting the best results from a fine-tuned model. ☆12 Updated last year
- Deep Speech Distances PyTorch ☆27 Updated 2 years ago
- ☆40 Updated last year
- 🎹 pyannote + 🗒 notebook = pyannotebook ☆25 Updated last year
- ☆10 Updated last year
- Adapting a ConvNeXt model to audio classification on AudioSet ☆19 Updated last year
- The VoxTube dataset official repository ☆61 Updated 9 months ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆32 Updated last year
- Code for the paper "Speech Recognition and Multi-Speaker Diarization of Long Conversations" ☆36 Updated last year
- ☆59 Updated last year
- NOTSOFAR-1 Challenge: Distant Diarization and ASR ☆44 Updated last week
- Collection of scripts from mHuBERT-147. ☆22 Updated this week
- NSNet2 Deep Noise Suppression (DNS) package ☆31 Updated 2 years ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆46 Updated 5 months ago