backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆12Updated 2 months ago
Alternatives and similar repositories for distilXLSR
Users who are interested in distilXLSR are comparing it to the libraries listed below.
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆30Updated 3 months ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆40Updated 9 months ago
- Streaming Vocos☆26Updated 4 months ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆8Updated 8 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆71Updated 7 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis☆26Updated 2 months ago
- Self-supervised Generative LM-based Voice Conversion☆36Updated last month
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆26Updated 2 weeks ago
- ☆28Updated 3 weeks ago
- ☆14Updated last month
- A toolkit dedicated to speech evaluation.☆20Updated 8 months ago
- faster inference☆28Updated 4 months ago
- Source code for DM-Codec.☆43Updated this week
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆18Updated 2 years ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆13Updated last month
- ☆10Updated 6 months ago
- ☆9Updated last year
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆36Updated 3 weeks ago
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a…☆65Updated last year
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS☆20Updated 8 months ago
- A neural speech codec based on discrete WavLM representations☆24Updated 9 months ago
- (WIP) Long-form speech generation☆31Updated 2 months ago
- Just another FastSpeech 2 but cleaner code :)☆26Updated 11 months ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆46Updated last month
- ☆18Updated last year
- A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec for Speech Generation☆32Updated last week
- ☆15Updated last year
- The demo page for ALMTokenizer☆48Updated last month
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…☆37Updated 3 weeks ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"☆32Updated last year