☆26Dec 4, 2024Updated last year
Alternatives and similar repositories for sdhubert
Users that are interested in sdhubert are comparing it to the libraries listed below
Sorting:
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio☆73Mar 17, 2025Updated 11 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆61Jul 1, 2025Updated 8 months ago
- ☆15Nov 11, 2024Updated last year
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings☆19Jun 6, 2025Updated 8 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆87Dec 20, 2024Updated last year
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model☆34Aug 27, 2023Updated 2 years ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF.☆24Aug 1, 2025Updated 7 months ago
- ☆56May 29, 2025Updated 9 months ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆24Nov 12, 2025Updated 3 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆45Sep 5, 2025Updated 5 months ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"☆34Nov 23, 2023Updated 2 years ago
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS☆24Sep 9, 2024Updated last year
- ☆31Jul 13, 2023Updated 2 years ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated last year
- An AR+AR TTS attempt.☆18Jan 13, 2025Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆68Nov 1, 2024Updated last year
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- ☆11Aug 11, 2023Updated 2 years ago
- A collection of all our phonemeizers for dataset construction and inference☆27Feb 21, 2025Updated last year
- A pitch detection model trained to be robust against noise and reverberation environments.☆27Jan 21, 2025Updated last year
- The demo page for ALMTokenizer☆59Apr 14, 2025Updated 10 months ago
- ☆11Nov 7, 2024Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- speex aec kalman filter☆15Mar 17, 2024Updated last year
- ☆15Jul 14, 2020Updated 5 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- ☆10Dec 22, 2023Updated 2 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.☆11Nov 3, 2020Updated 5 years ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated last year
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆56Jun 1, 2025Updated 9 months ago
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis☆45Mar 2, 2021Updated 5 years ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆15Dec 10, 2024Updated last year
- TTS for Singlish using Tacotron2, the IMDA corpus, and Pachyderm.☆11Jan 11, 2020Updated 6 years ago
- ☆13Jan 14, 2025Updated last year
- ☆11Mar 22, 2023Updated 2 years ago