C3Imaging / whisper_child_asr

☆11

Alternatives and similar repositories for whisper_child_asr

Users interested in whisper_child_asr are comparing it to the repositories listed below

standing-o / Combined_Dataset_for_Speech_Emotion_Recognition
A collection of 8 English speech datasets for SER
☆30 Updated last year
DongKeon / Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
☆327 Updated 7 months ago
usc-sail / child-adult-diarization
Public child-adult speaker diarization/classification model and code
☆17 Updated 8 months ago
BUTSpeechFIT / VBx
Variational Bayes HMM over x-vectors diarization
☆281 Updated last year
BUTSpeechFIT / EEND
☆91 Updated 8 months ago
halsay / ASR-TTS-paper-daily
ASR papers, updated every day
☆426 Updated this week
sarulab-speech / UTMOS22
UT-Sarulab MOS prediction system using SSL models
☆291 Updated last year
Audio-WestlakeU / FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …
☆160 Updated last month
emo-box / EmoBox
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
☆304 Updated 9 months ago
wavlab-speech / versa
Versatile Evaluation of Speech and Audio
☆373 Updated last month
Xflick / EEND_PyTorch
A PyTorch implementation of End-to-End Neural Diarization
☆109 Updated 2 years ago
IDRnD / redimnet
The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
☆185 Updated 3 months ago
juice500ml / dysarthria-mtl
Official implementation of the paper "Automatic Severity Assessment of Dysarthric speech by using Self-supervised Model with Multi-task L…
☆11 Updated last year
asvspoof-challenge / asvspoof5
☆62 Updated last year
SpeechColab / GigaSpeech2
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
☆180 Updated 4 months ago
lifeiteng / naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆230 Updated last year
nryant / dscore
Diarization scoring tools.
☆260 Updated 2 years ago
felixbur / nkululeko
Machine learning speaker characteristics
☆41 Updated last month
AudioLLMs / AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
☆291 Updated 6 months ago
Takaaki-Saeki / DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
☆176 Updated last year
sarulab-speech / UTMOSv2
UTokyo-SaruLab MOS Prediction System
☆283 Updated 3 weeks ago
thuhcsi / SpeechCraft
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
☆177 Updated 8 months ago
ankitapasad / layerwise-analysis
Layer-wise analysis of self-supervised pre-trained speech representations
☆122 Updated last year
wenet-e2e / wesep
Target Speaker Extraction Toolkit
☆238 Updated 3 months ago
MatthewCYM / VoiceBench
VoiceBench: Benchmarking LLM-Based Voice Assistants
☆316 Updated last month
nttcslab-sp / EEND-vector-clustering
This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-cl…
☆77 Updated 3 years ago
YuanGongND / ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
☆464 Updated last year
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆193 Updated last year
mkunes / w2v2_audioFrameClassification
wav2vec2 audio classification for prosodic boundary detection and other tasks
☆42 Updated 2 years ago
dynamic-superb / dynamic-superb
The official repository of Dynamic-SUPERB.
☆198 Updated 6 months ago