REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR
☆14Dec 11, 2024Updated last year
Alternatives and similar repositories for reborn-uasr
Users that are interested in reborn-uasr are comparing it to the libraries listed below
- Demo for DART, Audio Imagination workshop submission in NeurIPS 2024☆12Apr 15, 2025Updated 10 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.☆29Mar 14, 2025Updated 11 months ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs☆35Feb 26, 2026Updated last week
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- ☆14Aug 19, 2024Updated last year
- PyTorch model for contextless phoneme prediction from speech audio☆32Oct 30, 2025Updated 4 months ago
- DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast…☆53Updated this week
- ☆46Jul 7, 2025Updated 8 months ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆51May 1, 2025Updated 10 months ago
- Encode and decode audio samples to/from continuous and discrete compressed representations!☆106Nov 25, 2025Updated 3 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆61Jul 1, 2025Updated 8 months ago
- This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs☆90Sep 19, 2025Updated 5 months ago
- Prosody and Pronunciation Modification Network☆63May 5, 2025Updated 10 months ago
- Alignment examples for Interspeech 2024☆27Jul 5, 2024Updated last year
- Rectifying Self Organizing Map☆29Oct 7, 2024Updated last year
- Collection of scripts from mHuBERT-147.☆32Nov 19, 2024Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆69Nov 1, 2024Updated last year
- ☆100Jan 19, 2026Updated last month
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆29Jul 9, 2024Updated last year
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…☆46Jul 2, 2024Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆78Nov 1, 2024Updated last year
- ☆37Jul 15, 2025Updated 7 months ago
- Viterbi decoding in PyTorch☆41Sep 10, 2025Updated 5 months ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation☆39Nov 20, 2024Updated last year
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio☆74Mar 17, 2025Updated 11 months ago
- [ICASSP 2024] Official code for FreGrad☆35May 13, 2024Updated last year
- Training code and dataset cleansing with Sidon☆80Jan 16, 2026Updated last month
- ☆10Jun 6, 2024Updated last year
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆45Updated this week
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis☆44Oct 28, 2024Updated last year
- VITS-based zero-shot TTS system varying with diverse style/speaker conditioning methods.☆36Sep 21, 2022Updated 3 years ago
- AudioBERT 📢 : Audio Knowledge Augmented Language Model (ICASSP 2025)☆41Feb 1, 2025Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Punch Out Model Synthesis - a program for constraint based tiling generation☆19Feb 1, 2026Updated last month
- [NeurIPS'22] PyTorch library to compare similarity between NN representations☆13Feb 27, 2025Updated last year
- SocksSharp provides support for Socks4/4a/5 proxy servers to HttpClient☆12Feb 3, 2021Updated 5 years ago
- An audio racing game☆24Updated this week
- LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10 languages: Chinese English Spanish Russian French German Ital…☆91Jan 14, 2026Updated last month