CouncilDataProject/speakerbox

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/CouncilDataProject/speakerbox)

CouncilDataProject / speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.

☆61

Alternatives and similar repositories for speakerbox

Users that are interested in speakerbox are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

hmohebbi / disentangling_representations
View on GitHub
☆14Oct 3, 2025Updated 9 months ago
bagustris / ssl-ser
View on GitHub
Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"
☆10Mar 15, 2023Updated 3 years ago
ductuantruong / speaker_age_estimation_ssl_study
View on GitHub
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14Oct 19, 2022Updated 3 years ago
m417z / the-old-new-thing-userscript
View on GitHub
Archived comments for old pages of The Old New Thing, a userscript to embed them and fix old links
☆15Feb 14, 2025Updated last year
makerjackie / tts-frontend-dataset
View on GitHub
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
☆104Feb 5, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
justinlovelace / SESD
View on GitHub
☆61Oct 28, 2024Updated last year
CVxTz / COLA_pytorch
View on GitHub
COLA contrastive pre-training method implemented in PyTorch
☆44Jan 27, 2021Updated 5 years ago
shinhyeokoh / rwen
View on GitHub
☆14Jun 16, 2023Updated 3 years ago
CherokeeLanguage / Cherokee-TTS
View on GitHub
Using Tacotron2 to do Cherokee Text to Speech
☆10May 10, 2022Updated 4 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
View on GitHub
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15Nov 30, 2022Updated 3 years ago
juice500ml / xlm_to_xlsr
View on GitHub
Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)
☆12Mar 12, 2024Updated 2 years ago
JaesungHuh / SimpleDiarization
View on GitHub
Simple diarization model
☆53Jun 13, 2025Updated last year
MTG / playlists-stat-analysis
View on GitHub
Tools for Analyzing Popularity and Semantic Diversity of a Playlist Dataset
☆10Jun 17, 2024Updated 2 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
View on GitHub
Using YouTube to prepare a speech recognition dataset for any language
☆10Mar 30, 2021Updated 5 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
falabrasil / gitlab-resources
View on GitHub
This is a legacy repo. Dev occurs now on GitHub.
☆11Mar 28, 2021Updated 5 years ago
tonnetonne814 / PL-Bert-VITS2
View on GitHub
VITS2 using Phoneme-Level Japanese BERT
☆14Dec 17, 2023Updated 2 years ago
haideraltahan / CLAR
View on GitHub
☆18Apr 12, 2021Updated 5 years ago
vistec-AI / vistec-ser
View on GitHub
Speech Emotion Recognition using PyTorch sponsored by AIS and VISTEC-DEPA AIResearch Institute Thailand.
☆23Nov 6, 2021Updated 4 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
View on GitHub
☆11Nov 5, 2021Updated 4 years ago
bootphon / learnable-strf
View on GitHub
Learnable STRF, from Riad et al. 2021 JASA
☆13Aug 21, 2021Updated 4 years ago
NoerNova / ShanNLP
View on GitHub
Shan Natural Language Processing tools inspired by PythaiNLP
☆14Mar 1, 2026Updated 4 months ago
alumae / torch-xvectors-wav
View on GitHub
☆22Jun 30, 2021Updated 5 years ago
nii-yamagishilab / SpeechSPC-mini
View on GitHub
Speech Security and Privacy Compendium - Mini
☆10Jun 18, 2024Updated 2 years ago
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
WangHelin1997 / Aty-TTS
View on GitHub
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11May 14, 2025Updated last year
Chung-I / youtube-asr-crawler
View on GitHub
☆10Sep 19, 2022Updated 3 years ago
30stomercury / hmm-backprop
View on GitHub
Fast and differentiable hidden Markov model in C++
☆19Jan 20, 2023Updated 3 years ago
motazsaad / ara-pronunciation-tool
View on GitHub
A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15Sep 5, 2017Updated 8 years ago
talhanai / kaldi-diar-latte
View on GitHub
steps to perform text-based speaker diarization with kaldi toolkit
☆12Nov 2, 2018Updated 7 years ago
yanghaha0908 / FastHuBERT
View on GitHub
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
☆100Nov 20, 2024Updated last year
xinjli / ucla-phonetic-corpus
View on GitHub
Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION
☆46May 12, 2023Updated 3 years ago
asteroid-team / Libri_VAD
View on GitHub
Script to generate VAD dataset used in Asteroid recipe
☆21Sep 30, 2021Updated 4 years ago
AkshathRaghav / tinyspeech
View on GitHub
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
☆23Jun 7, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
liuhuang31 / g2pw_once
View on GitHub
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14Dec 30, 2023Updated 2 years ago
yuhangear / wenet-android
View on GitHub
☆13Oct 27, 2021Updated 4 years ago
poleval / 2021-punctuation-restoration
View on GitHub
PolEval 2021 Task 1
☆15Jun 28, 2022Updated 4 years ago
emirdemirel / DALI-TestSet4ALT
View on GitHub
This is a subset of the DALI set consisting of 240 polyphonic recordings that is used to benchmark lyrics transcription evaluation.
☆12Nov 30, 2021Updated 4 years ago
laurensw75 / docker-Kaldi-NL
View on GitHub
Docker for building an environment for Dutch online and offline ASR.
☆12Feb 2, 2021Updated 5 years ago
drscotthawley / fad_pytorch
View on GitHub
Frechet Audio Distance evaluation in PyTorch
☆36Jun 9, 2023Updated 3 years ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Oct 14, 2024Updated last year