etri / kmsav
☆12 Updated 10 months ago
Alternatives and similar repositories for kmsav
Users that are interested in kmsav are comparing it to the libraries listed below
- Simple tool for speech dataset augmentation for modeling various prosodies. ☆14 Updated 4 years ago
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 Updated 11 months ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 Updated last year
- ☆13 Updated 11 months ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition" ☆17 Updated 4 years ago
- ☆14 Updated last year
- ☆14 Updated last month
- Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆15 Updated 2 years ago
- Official implementation of the APSIPA 2022 paper: Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 Updated 2 years ago
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition… ☆14 Updated last year
- ☆19 Updated last year
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr… ☆16 Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Updated last year
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆18 Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆10 Updated 4 months ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition" ☆10 Updated 2 years ago
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis" ☆13 Updated 10 months ago
- Whisper Speech Quality Assessment (WhiSQA) ☆15 Updated 9 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆24 Updated last week
- Forced alignment decoder for Whisper. ☆14 Updated last year
- acnn for text-independent speaker recognition ☆10 Updated 3 years ago
- wake-up word emotion recognition [APSIPA 2022] ☆17 Updated 2 years ago
- A simple command line tool to calculate WER for ASR. ☆14 Updated 11 months ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆17 Updated last year
- ☆11 Updated 2 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated 2 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering ☆21 Updated last year
- ☆11 Updated 2 years ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 Updated 6 months ago