audeering / w2v2-age-gender-how-toView external linksLinks
How to use our public wav2vec2 age and gender model
☆53Sep 4, 2023Updated 2 years ago
Alternatives and similar repositories for w2v2-age-gender-how-to
Users that are interested in w2v2-age-gender-how-to are comparing it to the libraries listed below
Sorting:
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Sep 2, 2024Updated last year
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021☆71Dec 18, 2021Updated 4 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 5, 2025Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- ESLTTS dataset☆16Feb 6, 2025Updated last year
- ☆59Oct 22, 2025Updated 3 months ago
- LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM☆18May 17, 2024Updated last year
- MSP-Podcast Challenge Baseline Code☆30Jun 12, 2024Updated last year
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS☆24Sep 9, 2024Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆68May 12, 2021Updated 4 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- ☆11Nov 7, 2024Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- ☆19Sep 10, 2024Updated last year
- ☆14Aug 1, 2025Updated 6 months ago
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 8 months ago
- Reimplementation of Miipher☆29Aug 16, 2023Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 9 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…☆105Jan 10, 2025Updated last year
- A unified dataset of multilingual emotional human utterances☆29Jan 16, 2026Updated last month
- 2024 Latest laughter detection & segmentaion model. Paper: "Robust Laughter Segmentation with Automatic Diverse Data Synthesis", Interspe…☆62Sep 1, 2024Updated last year
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995☆78Dec 3, 2024Updated last year
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- A solution to denoising and separating for two-speaker-mixed noisy speech, using a BSRNN inspired network.☆14Aug 22, 2023Updated 2 years ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model☆34Aug 27, 2023Updated 2 years ago
- MFA acoustic model training based on Opencpop☆15Sep 23, 2022Updated 3 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Mar 28, 2023Updated 2 years ago
- (WIP)long form speech generatoins☆31Apr 2, 2025Updated 10 months ago
- ☆28Dec 22, 2021Updated 4 years ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆52Apr 1, 2021Updated 4 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- My vocoder experiments☆31Jul 26, 2025Updated 6 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition☆28Dec 16, 2022Updated 3 years ago