shinshoji01 / MacST-project-page

This is the project page of our paper "MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion".

☆11

Alternatives and similar repositories for MacST-project-page

Users who are interested in MacST-project-page are comparing it to the repositories listed below.

hmohebbi / disentangling_representations
☆12 Updated 9 months ago
Taltt / FNSE-SBGAN
FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks
☆13 Updated 2 months ago
xiaoxue1117 / speech-mamba-public
☆11 Updated 7 months ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆12 Updated 5 months ago
interactiveaudiolab / emphases
Crowdsourced and Automatic Speech Prominence Estimation
☆21 Updated last year
YoshikiMas / madeon-asr
[SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition
☆14 Updated 7 months ago
exercise-book-yq / FreeCodec
FreeCodec: A Disentangled Neural Speech Codec with Fewer Tokens
☆21 Updated 10 months ago
prairie-schooner / wav2vec-vc
☆11 Updated 2 years ago
jh-cha-prml / JELLY
Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis"
☆12 Updated 8 months ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 Updated 9 months ago
zjwang21 / mix-phoneme-bert
An unofficial PyTorch implementation of Mix-Phoneme-Bert
☆39 Updated 2 years ago
mcf330 / efts2code
Source code of EfficientTTS 2
☆16 Updated last year
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18 Updated 2 years ago
liuhuang31 / Megatts2_HierSpeechpp
Megatts2 using HierSpeechpp's vocoder
☆18 Updated 7 months ago
zjzser / WMCodec
PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…
☆14 Updated 8 months ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆10 Updated 7 months ago
cheoljun95 / sdhubert
☆25 Updated 7 months ago
Dapwner / CVAE-Tacotron
☆23 Updated last year
ductuantruong / speaker_age_estimation_ssl_study
Official implementation of the APSIPA 2022 paper: Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 Updated 2 years ago
ydqmkkx / Respiro-en
Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-…
☆27 Updated 10 months ago
shang0712 / HierTTS
☆45 Updated 2 years ago
Kevin-naticl / LLaSE
LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement
☆16 Updated last week
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 Updated 9 months ago
Liangzheng-ZL / BEdit-TTS
Speech samples and code of BEdit-TTS
☆33 Updated last year
csalt-research / accented-codebooks-asr
☆18 Updated 10 months ago
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆26 Updated last month
wyw97 / DENSE
[ICASSP 2025] Dynamic Embedding Causal Target Speech Extraction
☆3 Updated 4 months ago
AI-Unicamp / TTS-Objective-Metrics
Objective metrics used in several text-to-speech (TTS) papers.
☆49 Updated last month
koudounasalkis / voc2vec
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
☆37 Updated 3 months ago
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37 Updated 2 years ago