miguelcollette / audio_clustering

unsupervised clustering of speech / music, or genres of music

☆9

Alternatives and similar repositories for audio_clustering

Users who are interested in audio_clustering compare it to the repositories listed below.

pragyak412 / Improving-Voice-Separation-by-Incorporating-End-To-End-Speech-Recognition
Implementing the paper -
☆19 Updated last year
Ephrem-ETH / E2E-KWS
End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM
☆39 Updated 2 years ago
HaoFengyuan / X-TF-GridNet
The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", w…
☆58 Updated 7 months ago
jyhan03 / channel-decorrelation
multi-channel target speech extraction with channel decorrelation and target speaker adaptation
☆25 Updated 4 years ago
JorisCos / VCTK-2Mix
☆16 Updated 4 years ago
Jasson-Chen / Add_noise_and_rir_to_speech
The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generat…
☆29 Updated 3 years ago
daniel03c1 / NAS_VAD
☆25 Updated 7 months ago
iariav / End-to-End-VAD
an Audio-Visual Voice Activity Detection using Deep Learning
☆49 Updated 6 years ago
lin9x / AV-Sepformer
☆52 Updated last year
yuguochencuc / DB-AIAT
The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"
☆121 Updated 2 years ago
tvuong123 / ModulationDomainLoss
Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021
☆39 Updated 3 years ago
yucongzh / online_speaker_diarization
☆14 Updated 2 years ago
Windstudent / Complex-MTASSNet
Multi-Task Audio Source Separation, Two-Stage Model, Complex Domain.
☆93 Updated last year
Ryuk17 / noise-xorcist
Single Channel Speech Enhancement Methods and Toolbox
☆30 Updated 3 months ago
desh2608 / diarizer
Clustering-based methods for overlapping diarization
☆81 Updated last year
mborsdorf / UniversalSpeakerExtraction
☆14 Updated 3 years ago
Andong-Li-speech / DARCN
The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"
☆77 Updated 2 years ago
xuchenglin28 / speech_separation
Constrained Permutation Invariant Training, Speech Separation
☆47 Updated 4 years ago
yuzhou-git / deep-casa
Tensorflow implementation of deep CASA
☆65 Updated 4 years ago
RicherMans / SAT
Streaming Audiotransformers for online Audio tagging
☆44 Updated 11 months ago
mispchallenge / MISP2021-AVSR
repository for paper "Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis"
☆16 Updated 2 years ago
chenchy / D3Net
A pytorch implementation of D3Net.
☆11 Updated 3 years ago
popcornell / SparseLibriMix
☆59 Updated 4 years ago
microsoft / SIG-Challenge
☆81 Updated 11 months ago
nii-yamagishilab / Attention_Backend_for_ASV
Attention Backend for Automatic Speaker Verification with Multiple Enrollment Utterances
☆49 Updated 2 years ago
echocatzh / conv-stft
A STFT/iSTFT written up in PyTorch using 1D Convolutions
☆28 Updated 10 months ago
marc-moreaux / audioset_raw
Download and create a tfreader for the audioset dataset
☆16 Updated 5 years ago
mispchallenge / misp2022_baseline
☆30 Updated last year
YunyangZeng / TAPLoss
☆65 Updated last year
urgent-challenge / urgent2025_challenge
Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge.
☆53 Updated 2 weeks ago