ali-vilab / alitok
[ICLR2026] AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model
☆53 Oct 12, 2025 Updated 4 months ago
Alternatives and similar repositories for alitok
Users that are interested in alitok are comparing it to the libraries listed below
- [NeurIPS 2025] ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models ☆28 Jul 1, 2025 Updated 7 months ago
- ☆28 Mar 4, 2025 Updated 11 months ago
- ☆15 Aug 22, 2025 Updated 5 months ago
- Official Implementation for the paper: A Variational Framework for Improving Naturalness in Generative Spoken Language Models ☆22 Jun 18, 2025 Updated 7 months ago
- Official implementation of the paper: "NeoBabel: A Multilingual Open Tower for Visual Generation" ☆22 Aug 4, 2025 Updated 6 months ago
- Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class conditioning built on F5-TTS ☆28 Jan 9, 2026 Updated last month
- Speech Resynthesis and Language Modeling ☆27 Jun 11, 2025 Updated 8 months ago
- Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens" ☆88 Apr 10, 2025 Updated 10 months ago
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models ☆42 Mar 11, 2025 Updated 11 months ago
- ☆31 Jul 16, 2025 Updated 6 months ago
- ☆99 Jan 19, 2026 Updated 3 weeks ago
- Where is the "main theme" in an orchestral score? ☆12 Oct 25, 2025 Updated 3 months ago
- Try to replicate the architecture of MiniMaxTTS mentioned in its technical report ☆49 Sep 2, 2025 Updated 5 months ago
- ☆25 Jan 24, 2023 Updated 3 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 Mar 11, 2025 Updated 11 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 Apr 10, 2025 Updated 10 months ago
- ☆12 Feb 3, 2026 Updated last week
- A neural speech codec based on discrete WavLM representations ☆24 Aug 28, 2024 Updated last year
- A TTS Trained on Universal Audio. ☆41 Jun 6, 2025 Updated 8 months ago
- Incorporating AutoVocoder to MB-iSTFT-VITS ☆48 Dec 1, 2022 Updated 3 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding ☆51 May 1, 2025 Updated 9 months ago
- ☆11 Feb 20, 2025 Updated 11 months ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆17 Feb 1, 2026 Updated last week
- ☆22 Jul 30, 2025 Updated 6 months ago
- MU-GAN: Facial Attribute Editing based on Multi-attention Mechanism ☆12 Jun 7, 2020 Updated 5 years ago
- ☆35 Apr 8, 2025 Updated 10 months ago
- This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs ☆85 Sep 19, 2025 Updated 4 months ago
- faster inference ☆28 Jan 20, 2025 Updated last year
- Explore how to get VQ-VAE models efficiently! ☆67 Jul 24, 2025 Updated 6 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆35 Feb 11, 2025 Updated last year
- Codebase and project page for EDMSound ☆35 Nov 20, 2023 Updated 2 years ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆63 Nov 5, 2025 Updated 3 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆86 Dec 20, 2024 Updated last year
- FACM: Flow-Anchored Consistency Models ☆139 Aug 6, 2025 Updated 6 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆14 Jun 28, 2024 Updated last year
- ☆13 Mar 11, 2025 Updated 11 months ago
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆290 Jun 2, 2025 Updated 8 months ago
- My vocoder experiments ☆31 Jul 26, 2025 Updated 6 months ago
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆286 Dec 4, 2024 Updated last year