mmderakhshani / NeoBabel
Official implementation of the paper: "NeoBabel: A Multilingual Open Tower for Visual Generation"
☆22 Aug 4, 2025 Updated 6 months ago
Alternatives and similar repositories for NeoBabel
Users who are interested in NeoBabel are comparing it to the repositories listed below.
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 Mar 11, 2025 Updated 11 months ago
- Open Source code for our paper, Steering Autoregressive Music Generation with Recursive Feature Machines (Zhao et al., 2025). aka MusicRF… ☆31 Oct 26, 2025 Updated 3 months ago
- ☆12 Feb 3, 2026 Updated last week
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 Apr 10, 2025 Updated 10 months ago
- A TTS Trained on Universal Audio. ☆41 Jun 6, 2025 Updated 8 months ago
- ☆11 Feb 20, 2025 Updated 11 months ago
- My vocoder experiments ☆31 Jul 26, 2025 Updated 6 months ago
- ☆16 Dec 12, 2023 Updated 2 years ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆22 Feb 7, 2026 Updated last week
- [ICLR2026] AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model ☆53 Oct 12, 2025 Updated 4 months ago
- ☆15 Aug 22, 2025 Updated 5 months ago
- Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class conditioning built on F5-TTS ☆28 Jan 9, 2026 Updated last month
- MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i… ☆85 Updated this week
- Official Implementation for the paper: A Variational Framework for Improving Naturalness in Generative Spoken Language Models ☆22 Jun 18, 2025 Updated 7 months ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform ☆18 May 17, 2024 Updated last year
- ☆47 Aug 31, 2024 Updated last year
- Unofficial implementation of wavenext vocoder ☆57 Aug 28, 2024 Updated last year
- Speech Resynthesis and Language Modeling ☆27 Jun 11, 2025 Updated 8 months ago
- Variable Bitrate Residual Vector Quantization for Audio Coding ☆51 May 1, 2025 Updated 9 months ago
- ☆22 Apr 4, 2023 Updated 2 years ago
- Voice conversion with just linear regression. ☆33 Sep 25, 2025 Updated 4 months ago
- Singing voice conversion based on FreeVC ☆21 Dec 16, 2022 Updated 3 years ago
- [ASRU 2023] Code of paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation ☆21 Aug 13, 2024 Updated last year
- Implementation of RIFT-SVC, a singing voice conversion model based on Rectified Flow Transformer. ☆55 Nov 10, 2025 Updated 3 months ago
- ☆25 Jan 24, 2023 Updated 3 years ago
- A neural speech codec based on discrete WavLM representations ☆24 Aug 28, 2024 Updated last year
- Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues. ☆36 Jul 3, 2025 Updated 7 months ago
- ☆82 Dec 31, 2025 Updated last month
- Additional material for the paper ADTOF: A large dataset of non-synthetic music for automatic drum transcription ☆68 Sep 18, 2025 Updated 4 months ago
- A Singing Style Conversion Framework Based On Audio Infilling ☆33 Apr 28, 2025 Updated 9 months ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023 ☆27 Apr 27, 2023 Updated 2 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis ☆50 Sep 20, 2025 Updated 4 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆35 Feb 11, 2025 Updated last year
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆63 Nov 5, 2025 Updated 3 months ago
- Codebase and project page for EDMSound ☆35 Nov 20, 2023 Updated 2 years ago
- Official Implementation for StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences, NeurIPS' 24 ☆40 Mar 10, 2025 Updated 11 months ago
- a guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine ☆41 Jan 17, 2025 Updated last year
- LSLM implements full duplex modeling in interactive speech language models, based on research by Ma et al. (2024). This project advances … ☆85 Jun 22, 2025 Updated 7 months ago
- Prepend universal audio attack segment to mute Whisper ☆36 Jan 22, 2025 Updated last year