alefiury / Whisper-Classification-SER

☆8

Alternatives and similar repositories for Whisper-Classification-SER

Users who are interested in Whisper-Classification-SER are comparing it to the repositories listed below.

flamedtts / Flamed-TTS
This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …
☆39 Updated this week
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆14 Updated last year
ductuantruong / speaker_age_estimation_ssl_study
Official implementation of the APSIPA 2022 paper: Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 Updated 2 years ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 Updated last year
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆17 Updated 9 months ago
huutuongtu / Lightvoc
LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform
☆18 Updated last year
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆10 Updated 2 months ago
mbzuai-nlp / sttatts
☆26 Updated 9 months ago
liuhuang31 / HiFTNet-sr
HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz
☆24 Updated last year
ShoukanLabs / Vokan
The Vokan Architecture (Tsukasa speech based)
☆10 Updated 6 months ago
shivammehta25 / BetterFastSpeech2
Just another FastSpeech 2 but cleaner code :)
☆26 Updated last year
iamanigeeit / present
☆13 Updated 11 months ago
adelacvg / DPTTS
An AR+AR TTS attempt.
☆16 Updated 6 months ago
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 Updated 3 years ago
p1an-lin-jung / wv_tts
☆19 Updated last year
Zhongxu-Wang / ArtSpeech
ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations
☆19 Updated 3 months ago
jisang93 / VISinger
Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC…
☆15 Updated 2 years ago
ryota-komatsu / speech_resynth
Speech Resynthesis and Language Modeling
☆25 Updated 2 months ago
v-nhandt21 / MusicVoiceConversion
Sing any popular song with your voice
☆11 Updated 3 years ago
rishikksh20 / NU-Wave2-pytorch
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]
☆25 Updated 3 years ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 Updated last year
Jackson-Kang / Prosody-augmentation-for-Text-to-speech
Simple tool for speech dataset augmentation for modeling various prosodies.
☆14 Updated 4 years ago
lwang114 / UnsupTTS
☆37 Updated last year
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆76 Updated 2 years ago
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18 Updated 2 years ago
OlaWod / PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆34 Updated last year
Top34051 / stargan-zsvc
Unofficial PyTorch Implementation of StarGAN-ZSVC
☆14 Updated 4 years ago
ShovalMessica / NAST
Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…
☆46 Updated last year
archinetai / aligner-pytorch
Sequence alignment methods with helpers for PyTorch.
☆24 Updated 2 years ago
AI-S2-Lab / GPT-Talker
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆37 Updated 9 months ago