Speakerbox: Fine-tune Audio Transformers for speaker identification.
☆59Dec 1, 2024Updated last year
Alternatives and similar repositories for speakerbox
Users that are interested in speakerbox are comparing it to the libraries listed below
Sorting:
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- PolEval 2021 Task 1☆15Jun 28, 2022Updated 3 years ago
- Fast and differentiable hidden Markov model in C++☆19Jan 20, 2023Updated 3 years ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated last year
- ☆12Jun 10, 2021Updated 4 years ago
- Crawling and creating a German language model resource☆18Aug 23, 2022Updated 3 years ago
- ☆17Oct 16, 2018Updated 7 years ago
- ☆15Jul 11, 2022Updated 3 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆34Sep 9, 2025Updated 5 months ago
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias☆20Jul 21, 2020Updated 5 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 3 years ago
- ☆22Jun 30, 2021Updated 4 years ago
- Tool to aid in the creation of mashups☆19Apr 7, 2020Updated 5 years ago
- MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.☆22Apr 8, 2021Updated 4 years ago
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR.☆19Nov 23, 2021Updated 4 years ago
- Automatic Dialect Detection Repository☆39Nov 13, 2022Updated 3 years ago
- An open-source tool for automatic speech recognition ASR quality estimation.☆23Dec 12, 2019Updated 6 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION☆46May 12, 2023Updated 2 years ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"☆38Oct 28, 2025Updated 4 months ago
- COLA contrastive pre-training method implemented in PyTorch☆43Jan 27, 2021Updated 5 years ago
- Workflow for forced alignment between languages☆23Jan 13, 2026Updated last month
- ☆21Sep 24, 2018Updated 7 years ago
- A corpus of diacritized Hebrew texts (טקסט מנוקד)☆11May 4, 2022Updated 3 years ago
- Docker for building an environment for Dutch online and offline ASR.☆12Feb 2, 2021Updated 5 years ago
- ☆10Dec 22, 2023Updated 2 years ago
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset☆12Sep 29, 2025Updated 5 months ago
- offical code for Dense-TSNet☆12Sep 17, 2024Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆10Sep 30, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- Using YouTube to prepare a speech recognition dataset for any language☆10Mar 30, 2021Updated 4 years ago
- ☆11Nov 5, 2021Updated 4 years ago
- Scripts for recreating the Replication Dataset for Fundamental Frequency Estimation. Part of the dissertation "Pitch of Voiced Speech in …☆11Mar 29, 2021Updated 4 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- Speech Security and Privacy Compendium - Mini☆10Jun 18, 2024Updated last year
- REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR☆14Dec 11, 2024Updated last year
- ☆13Oct 3, 2025Updated 5 months ago
- Unsupervised speech activity detection system.☆11Jul 2, 2018Updated 7 years ago