lstrgar / self-supervised-phone-segmentation

Phoneme segmentation using pre-trained speech models

☆55

Alternatives and similar repositories for self-supervised-phone-segmentation

Users who are interested in self-supervised-phone-segmentation are comparing it to the repositories listed below.

xinjli / alqalign
multilingual speech aligner
☆76 Updated 2 years ago
nii-yamagishilab / VCC2020-database
☆53 Updated 5 years ago
prosodylab / prosobeast-annotation-tool
☆40 Updated 3 years ago
vectominist / spin
Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…
☆63 Updated 2 years ago
thuhcsi / NeuFA
Neural network-based forced alignment with bidirectional attention mechanism
☆78 Updated last year
spring-media / DeepForcedAligner
☆80 Updated 6 months ago
b04901014 / FG-transformer-TTS
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
☆89 Updated 3 years ago
soumimaiti / speechlmscore_tool
☆32 Updated last year
zerospeech / zerospeech2021_baseline
BERT and LSTM baseline models of the ZeroSpeech Challenge 2021
☆60 Updated 3 years ago
interactiveaudiolab / ppgs
High-Fidelity Neural Phonetic Posteriorgrams
☆122 Updated 11 months ago
ga642381 / SpeechPrompt
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102 Updated 10 months ago
kan-bayashi / LibriTTSLabel
Alignment files of LibriTTS.
☆67 Updated 5 years ago
keonlee9420 / Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
☆55 Updated 4 years ago
desh2608 / pytorch-tdnn
PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training
☆41 Updated 5 years ago
RF5 / simple-asgan
Training code and trained checkpoints for ASGAN.
☆62 Updated 2 years ago
thuhcsi / icassp2021-emotion-tts
Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/
☆34 Updated 2 years ago
cageyoko / CTC-Attention-Mispronunciation
A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augment Techniques
☆63 Updated 4 years ago
CSTR-Edinburgh / qualtreats
Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations.
☆36 Updated last year
RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80 Updated 2 years ago
bigpon / SpeechSubjectiveTest
Speech (audio) subjective evaluation system
☆42 Updated 5 years ago
Daisyqk / Automatic-Prosody-Annotation
☆111 Updated 3 years ago
aispeech-lab / w2v-cif-bert
☆37 Updated 4 years ago
desh2608 / diarizer
Clustering-based methods for overlapping diarization
☆82 Updated 2 years ago
ga642381 / SpeechPrompt-v2
《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm
☆82 Updated 2 years ago
JSALT-2022-SSL / superb-prosody
☆31 Updated 2 years ago
sky1456723 / Pytorch-MBNet
A PyTorch implementation of MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network
☆61 Updated 4 years ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 Updated last year
BenoitWang / Speech_Emotion_Diarization
☆70 Updated last year
b04901014 / UUVC
Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…
☆83 Updated 3 years ago
tts-tutorial / icassp2022
☆64 Updated 3 years ago