tomasJwYU/AutoPrepDemo

AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data

☆36

Alternatives and similar repositories for AutoPrepDemo

Users who are interested in AutoPrepDemo are comparing it to the libraries listed below.

chenpk00 / IS2024_stream_decoder_only_asr
☆16 · Mar 12, 2024 · Updated 2 years ago
Liangzheng-ZL / BEdit-TTS
Speech samples and code of BEdit-TTS
☆34 · Oct 8, 2023 · Updated 2 years ago
backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
revsic / torch-retriever-vc
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12 · Jan 27, 2023 · Updated 3 years ago
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
walker-hyf / NCSSD
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆61 · Nov 1, 2024 · Updated last year
0nutation / SpeechGPT2.github.io
☆12 · Jul 23, 2024 · Updated 2 years ago
XXH333 / WordVoice-main
The inference and training code for WordVoice.
☆54 · Jul 17, 2026 · Updated last week
lmxue / ICASSP2022_TTS_VC_Summary
ICASSP2022 TTS&VC Summary
☆13 · Jun 9, 2022 · Updated 4 years ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
jiangyiqiao / Paper-Kalman
Some papers about the Kalman filter
☆16 · Sep 4, 2019 · Updated 6 years ago
FreedomIntelligence / S2S-Arena
☆21 · Jun 4, 2026 · Updated last month
SpeechColab / GigaSpeech2
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
☆197 · Apr 28, 2026 · Updated 2 months ago
X-LANCE / UniCATS-CTX-txt2vec
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
☆64 · Nov 18, 2024 · Updated last year
makerjackie / tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
☆104 · Feb 5, 2024 · Updated 2 years ago
kamilakesbi / DiarizersLM
☆15 · Jul 16, 2024 · Updated 2 years ago
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
juhayna-zh / BSRNN-speech-preprocess
A solution for denoising and separating two-speaker mixed noisy speech, using a BSRNN-inspired network.
☆15 · Aug 22, 2023 · Updated 2 years ago
modelscope / FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener…
☆445 · Jan 25, 2024 · Updated 2 years ago
Zain-Jiang / Speech-Editing-Toolkit
It's a repository for implementations of neural speech editing algorithms.
☆206 · Jan 9, 2024 · Updated 2 years ago
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆27 · Jun 6, 2026 · Updated last month
mborsdorf / UniversalSpeakerExtraction
☆15 · Sep 6, 2021 · Updated 4 years ago
qinxiaoyi / Simple-Attention-Module-based-Speaker-Verification-with-Iterative-Noisy-Label-Detection
☆12 · Jun 14, 2022 · Updated 4 years ago
karthikbhamidipati / multi-task-speech-classification
Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset
☆28 · Jul 17, 2026 · Updated last week
Daisyqk / Automatic-Prosody-Annotation
☆112 · Mar 9, 2026 · Updated 4 months ago
ddlBoJack / emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…
☆1,163 · Dec 23, 2024 · Updated last year
lifeiteng / TTS-TextAnalyzer
TTS Text Analyzer
☆31 · Jul 20, 2023 · Updated 3 years ago
TomJwYu / WenetSpeechSpeakerCluster
☆55 · Jul 17, 2023 · Updated 3 years ago
X-LANCE / SLAM-LLM
A Framework for Speech, Language, Audio, Music Processing with Large Language Model
☆1,048 · Jan 15, 2026 · Updated 6 months ago
pzelasko / kaldialign
Python wrappers for Kaldi's Levenshtein distance and alignment code.
☆70 · Jun 15, 2026 · Updated last month
yihuitang / StyleTTS_Mandarin
Implementation of StyleTTS for Mandarin
☆11 · Jun 22, 2023 · Updated 3 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
mutiann / few-shot-transformer-tts
Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.
☆87 · Jul 25, 2022 · Updated 3 years ago
ASLP-lab / OSUM
OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.
☆494 · Nov 23, 2025 · Updated 8 months ago
gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,143 · Aug 7, 2024 · Updated last year
JusperLee / TFACM
☆23 · Jul 16, 2025 · Updated last year
zjwang21 / mix-phoneme-bert
An unofficial PyTorch implementation of Mix-Phoneme-Bert
☆40 · Jul 10, 2023 · Updated 3 years ago
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
ivanvovk / compressed-tacotron2-pytorch
Compressed version of Tacotron 2 using Tensor Train + Waveglow.
☆22 · Dec 26, 2019 · Updated 6 years ago