audeering/w2v2-how-to

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/audeering/w2v2-how-to)

audeering / w2v2-how-to

How to use our public wav2vec2 dimensional emotion model

☆555

Alternatives and similar repositories for w2v2-how-to

Users that are interested in w2v2-how-to are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

innnky / emotional-vits
View on GitHub
无需情感标注的情感可控语音合成模型，基于VITS
☆1,392Mar 30, 2023Updated 3 years ago
ddlBoJack / emotion2vec
View on GitHub
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…
☆1,158Dec 23, 2024Updated last year
b04901014 / FT-w2v2-ser
View on GitHub
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
☆152Oct 26, 2021Updated 4 years ago
Choddeok / EmoSphere-TTS
View on GitHub
[INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for …
☆181Updated this week
b04901014 / MQTTS
View on GitHub
☆260May 15, 2023Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
HLTSingapore / Emotional-Speech-Data
View on GitHub
This is the GitHub page for publicly available emotional speech data.
☆401Jan 6, 2022Updated 4 years ago
KunZhou9646 / Mixed_Emotions
View on GitHub
☆123Oct 24, 2022Updated 3 years ago
MasayaKawamura / MB-iSTFT-VITS
View on GitHub
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
☆469Nov 17, 2022Updated 3 years ago
KunZhou9646 / Emovox
View on GitHub
This is the implementation of the paper "Emotion Intensity and its Control for Emotional Voice Conversion".
☆95Feb 9, 2022Updated 4 years ago
SuperKogito / SER-datasets
View on GitHub
A collection of datasets for the purpose of emotion recognition/detection in speech.
☆420Sep 30, 2024Updated last year
lifeiteng / naturalspeech3_facodec
View on GitHub
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆252Apr 20, 2024Updated 2 years ago
jishengpeng / ControlSpeech
View on GitHub
[ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
☆276Nov 22, 2024Updated last year
habla-liaa / ser-with-w2v2
View on GitHub
Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
☆140Jan 6, 2025Updated last year
ETZET / SpeechEmotionAVLearning
View on GitHub
☆13Nov 25, 2023Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
thuhcsi / VoxInstruct
View on GitHub
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
☆100Nov 9, 2024Updated last year
hs-oh-prml / DurFlexEVC
View on GitHub
☆81Jan 22, 2025Updated last year
keonlee9420 / Cross-Speaker-Emotion-Transfer
View on GitHub
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T…
☆194Nov 9, 2022Updated 3 years ago
line / LibriTTS-P
View on GitHub
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161Jun 13, 2024Updated 2 years ago
3loi / NaturalVoices
View on GitHub
☆61Oct 22, 2025Updated 8 months ago
Rongjiehuang / GenerSpeech
View on GitHub
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
☆333Feb 9, 2024Updated 2 years ago
gemelo-ai / vocos
View on GitHub
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,142Aug 7, 2024Updated last year
emo-box / EmoBox
View on GitHub
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
☆321Mar 18, 2026Updated 4 months ago
keonlee9420 / PortaSpeech
View on GitHub
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
☆342Feb 17, 2022Updated 4 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
huawei-noah / Speech-Backbones
View on GitHub
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
☆604Sep 18, 2023Updated 2 years ago
jishengpeng / TextrolSpeech
View on GitHub
[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
☆187Nov 22, 2024Updated last year
KdaiP / StableTTS
View on GitHub
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆438Sep 13, 2024Updated last year
anonymous-pits / pits
View on GitHub
PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
☆280Jul 16, 2023Updated 3 years ago
thuhcsi / SECap
View on GitHub
☆181Jul 9, 2024Updated 2 years ago
deepvk / emospeech
View on GitHub
☆129Aug 19, 2024Updated last year
ConsistencyVC / ConsistencyVC-voive-conversion
View on GitHub
Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion
☆153Oct 16, 2023Updated 2 years ago
tarepan / SpeechMOS
View on GitHub
Easy-to-Use Speech MOS predictors
☆360Oct 24, 2023Updated 2 years ago
bshall / hubert
View on GitHub
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
☆405Oct 1, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
jaywalnut310 / vits
View on GitHub
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
☆7,885Dec 6, 2023Updated 2 years ago
keonlee9420 / DailyTalk
View on GitHub
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
☆259Jun 5, 2025Updated last year
richardbaihe / a3t
View on GitHub
Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
☆89Sep 6, 2024Updated last year
chomeyama / SiFiGAN
View on GitHub
Official implementation of the source-filter HiFiGAN vocoder
☆275Jul 29, 2023Updated 2 years ago
b04901014 / UUVC
View on GitHub
Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…
☆83Jan 7, 2023Updated 3 years ago
AI-S2-Lab / GPT-Talker
View on GitHub
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆45Oct 28, 2024Updated last year
keonlee9420 / Expressive-FastSpeech2
View on GitHub
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,…
☆321Aug 25, 2021Updated 4 years ago