MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA reconstruction and strong performance in generation and understanding—serving as a unified interface for next-generation native audio language models.
☆181Mar 6, 2026Updated last month
Alternatives and similar repositories for MOSS-Audio-Tokenizer
Users that are interested in MOSS-Audio-Tokenizer are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Try to replicate the architecture of MiniMaxTTS mentioned in it's technical report☆47Sep 2, 2025Updated 7 months ago
- ☆68Dec 30, 2025Updated 3 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"☆214Sep 19, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆13Mar 11, 2025Updated last year
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …☆23Mar 17, 2026Updated 3 weeks ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Open Source code for our paper, Steering Autoregressive Music Generation with Recursive Feature Machines (Zhao et al., 2025). aka MusicRF…☆38Oct 26, 2025Updated 5 months ago
- A TTS Trained on Universal Audio.☆41Jun 6, 2025Updated 10 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …☆93Dec 28, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- A Singing Style Conversion Framework Based On Audio Infilling☆33Apr 28, 2025Updated 11 months ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis☆349Jul 21, 2025Updated 8 months ago
- ☆60Oct 22, 2025Updated 5 months ago
- Official code for SongEcho☆57Mar 3, 2026Updated last month
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.☆139Sep 19, 2025Updated 6 months ago
- ☆38Jun 16, 2024Updated last year
- ☆100Jan 19, 2026Updated 2 months ago
- ☆25Jan 24, 2023Updated 3 years ago
- Mandarin Chinese audio datasets aligned with Montreal Forced Aligner☆19Aug 13, 2024Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 5 months ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆52Sep 2, 2025Updated 7 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Self-supervised Generative LM-based Voice Conversion☆57Apr 24, 2025Updated 11 months ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆57Feb 12, 2026Updated 2 months ago
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".☆323Aug 4, 2025Updated 8 months ago
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated 3 weeks ago
- a guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine☆42Jan 17, 2025Updated last year
- The open source code for LLM-Codec☆146Aug 18, 2024Updated last year
- A python tool help to interact with chatgpt.☆10Dec 11, 2022Updated 3 years ago
- ☆20May 7, 2025Updated 11 months ago
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech.☆31Apr 13, 2023Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆36Aug 30, 2025Updated 7 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- [ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆131Sep 2, 2025Updated 7 months ago
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 5 months ago
- ☆158Mar 30, 2026Updated 2 weeks ago
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting☆17Mar 4, 2025Updated last year
- ☆11Feb 20, 2025Updated last year
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 7 months ago
- Training, validation, and inference code for various SSL approaches and architectures.☆85Apr 7, 2026Updated last week