MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA reconstruction and strong performance in generation and understanding—serving as a unified interface for next-generation native audio language models.
☆228Jun 8, 2026Updated last week
Alternatives and similar repositories for MOSS-Audio-Tokenizer
Users that are interested in MOSS-Audio-Tokenizer are comparing it to the libraries listed below.
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report☆47Sep 2, 2025Updated 9 months ago
- ☆71Dec 30, 2025Updated 5 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"☆217Sep 19, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆14Mar 11, 2025Updated last year
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …☆23May 19, 2026Updated 3 weeks ago
- Open Source code for our paper, Steering Autoregressive Music Generation with Recursive Feature Machines (Zhao et al., 2025). aka MusicRF…☆40Oct 26, 2025Updated 7 months ago
- A TTS Trained on Universal Audio.☆41Jun 6, 2025Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …☆98Dec 28, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- A Singing Style Conversion Framework Based On Audio Infilling☆35Apr 28, 2025Updated last year
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis☆358Jul 21, 2025Updated 10 months ago
- Official code for SongEcho☆64Mar 3, 2026Updated 3 months ago
- ☆60Oct 22, 2025Updated 7 months ago
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.☆144Sep 19, 2025Updated 8 months ago
- ☆39Jun 16, 2024Updated last year
- ☆25Jan 24, 2023Updated 3 years ago
- ☆101Jan 19, 2026Updated 4 months ago
- Mandarin Chinese audio datasets aligned with Montreal Forced Aligner☆19Aug 13, 2024Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 7 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆55Sep 2, 2025Updated 9 months ago
- Self-supervised Generative LM-based Voice Conversion☆58Apr 24, 2025Updated last year
- ☆64Apr 28, 2026Updated last month
- Compute WER and SER for speech recognition evaluation☆26Jun 6, 2026Updated last week
- A guide to grapheme-to-phoneme conversion and a phoneme list for the ace singing voice synthesis engine☆44Jan 17, 2025Updated last year
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".☆347Aug 4, 2025Updated 10 months ago
- The open source code for LLM-Codec☆147Aug 18, 2024Updated last year
- A Python tool that helps you interact with ChatGPT.☆10Dec 11, 2022Updated 3 years ago
- ☆20May 7, 2025Updated last year
- List of podcast feeds obtained via the iTunes API, plus a script to download ~6,000,000 hours of English speech.☆31Apr 13, 2023Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆37Aug 30, 2025Updated 9 months ago
- A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation☆168Nov 30, 2025Updated 6 months ago
- [ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆140Sep 2, 2025Updated 9 months ago
- Implementation of "Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling"☆21Nov 3, 2025Updated 7 months ago
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting☆18Mar 4, 2025Updated last year
- ☆162Mar 30, 2026Updated 2 months ago
- ☆11Feb 20, 2025Updated last year
- Training, validation, and inference code for various SSL approaches and architectures.☆88Apr 7, 2026Updated 2 months ago