MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA reconstruction and strong performance in generation and understanding—serving as a unified interface for next-generation native audio language models.
☆202 · Apr 27, 2026 · Updated last week
Alternatives and similar repositories for MOSS-Audio-Tokenizer
Users that are interested in MOSS-Audio-Tokenizer are comparing it to the libraries listed below.
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report ☆47 · Sep 2, 2025 · Updated 8 months ago
- ☆68 · Dec 30, 2025 · Updated 4 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆217 · Sep 19, 2024 · Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆14 · Mar 11, 2025 · Updated last year
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆23 · Apr 13, 2026 · Updated 3 weeks ago
- Open-source code for our paper, Steering Autoregressive Music Generation with Recursive Feature Machines (Zhao et al., 2025). aka MusicRF… ☆38 · Oct 26, 2025 · Updated 6 months ago
- A TTS Trained on Universal Audio. ☆41 · Jun 6, 2025 · Updated 11 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆94 · Dec 28, 2024 · Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated last year
- A Singing Style Conversion Framework Based On Audio Infilling ☆33 · Apr 28, 2025 · Updated last year
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆352 · Jul 21, 2025 · Updated 9 months ago
- ☆60 · Oct 22, 2025 · Updated 6 months ago
- Official code for SongEcho ☆59 · Mar 3, 2026 · Updated 2 months ago
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction. ☆142 · Sep 19, 2025 · Updated 7 months ago
- ☆38 · Jun 16, 2024 · Updated last year
- ☆25 · Jan 24, 2023 · Updated 3 years ago
- ☆101 · Jan 19, 2026 · Updated 3 months ago
- Mandarin Chinese audio datasets aligned with Montreal Forced Aligner ☆19 · Aug 13, 2024 · Updated last year
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models ☆53 · Sep 2, 2025 · Updated 8 months ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 5 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆62 · Nov 1, 2024 · Updated last year
- Self-supervised Generative LM-based Voice Conversion ☆58 · Apr 24, 2025 · Updated last year
- ☆60 · Apr 28, 2026 · Updated last week
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ☆334 · Aug 4, 2025 · Updated 9 months ago
- Compute WER and SER for speech recognition evaluation ☆27 · Mar 18, 2026 · Updated last month
- A guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine ☆43 · Jan 17, 2025 · Updated last year
- The open source code for LLM-Codec ☆147 · Aug 18, 2024 · Updated last year
- A Python tool that helps you interact with ChatGPT. ☆10 · Dec 11, 2022 · Updated 3 years ago
- ☆20 · May 7, 2025 · Updated 11 months ago
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech. ☆31 · Apr 13, 2023 · Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆36 · Aug 30, 2025 · Updated 8 months ago
- A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation ☆163 · Nov 30, 2025 · Updated 5 months ago
- [ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆133 · Sep 2, 2025 · Updated 8 months ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆20 · Nov 3, 2025 · Updated 6 months ago
- ☆158 · Mar 30, 2026 · Updated last month
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting ☆18 · Mar 4, 2025 · Updated last year
- ☆11 · Feb 20, 2025 · Updated last year
- Sound Separation, Omni modal ☆28 · Sep 15, 2025 · Updated 7 months ago