MTG / omar-rq
Training, validation, and inference code for various SSL approaches and architectures.
☆77 · Oct 22, 2025 · Updated 3 months ago
Alternatives and similar repositories for omar-rq
Users who are interested in omar-rq are comparing it to the repositories listed below
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆195 · Dec 13, 2024 · Updated last year
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs ☆57 · Nov 19, 2025 · Updated 2 months ago
- ☆28 · Jul 31, 2025 · Updated 6 months ago
- Encode and decode audio samples to/from continuous and discrete compressed representations! ☆101 · Nov 25, 2025 · Updated 2 months ago
- PyTorch implementation of the paper Learning Multi-Level Representations for Hierarchical Music Structure Analysis presented at ISMIR 202… ☆14 · Jan 2, 2023 · Updated 3 years ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆123 · Sep 2, 2025 · Updated 5 months ago
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ☆308 · Aug 4, 2025 · Updated 6 months ago
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis" ☆107 · Dec 20, 2025 · Updated last month
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders" ☆14 · Sep 20, 2024 · Updated last year
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆16 · Jun 23, 2024 · Updated last year
- FlowMirror-HydraVox: A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens… ☆38 · Jan 22, 2026 · Updated 3 weeks ago
- State-of-the-art pretrained music models for training, evaluation, inference ☆160 · Jan 20, 2026 · Updated 3 weeks ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆31 · Aug 30, 2025 · Updated 5 months ago
- ☆247 · Feb 14, 2024 · Updated 2 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding ☆51 · May 1, 2025 · Updated 9 months ago
- Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (IC… ☆64 · Jan 27, 2026 · Updated 2 weeks ago
- Audio-FLAN ☆160 · Sep 23, 2025 · Updated 4 months ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ☆22 · Jul 10, 2024 · Updated last year
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ☆153 · Mar 24, 2025 · Updated 10 months ago
- Code for the blog "Neural audio codecs: how to get audio into LLMs" ☆151 · Oct 20, 2025 · Updated 3 months ago
- A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps. ☆197 · Jul 14, 2025 · Updated 7 months ago
- Official MATPAC implementation and trained model's weights ☆26 · Sep 23, 2025 · Updated 4 months ago
- A standardized toolkit of Kernel Audio Distance (KAD), a distribution-free, unbiased, and computationally efficient metric for evaluating … ☆94 · Jun 12, 2025 · Updated 8 months ago
- ☆64 · May 3, 2024 · Updated last year
- A Python algorithm to change the pitch of the voice in real time ☆13 · Dec 13, 2020 · Updated 5 years ago
- A repo that builds text-to-music datasets from scratch, used in MuseControlLite [ICML2025] ☆27 · May 20, 2025 · Updated 8 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆86 · Dec 20, 2024 · Updated last year
- ☆37 · Jul 4, 2024 · Updated last year
- Official implementation for FlowSep ☆69 · Jan 2, 2025 · Updated last year
- PyTorch implementation of SoundCTM ☆100 · Mar 31, 2025 · Updated 10 months ago
- ☆99 · Jan 19, 2026 · Updated 3 weeks ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆38 · Jan 6, 2024 · Updated 2 years ago
- An automatic sample identification (ASID) system using a contrastively trained GNN encoder. ☆13 · Sep 21, 2025 · Updated 4 months ago
- Fast constant-Q transform feature, C++ implementation ☆11 · Jul 6, 2023 · Updated 2 years ago
- A codebase for data crawling and preprocessing for TTS and ASR systems training. ☆22 · Feb 5, 2026 · Updated last week
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024) ☆94 · Dec 3, 2024 · Updated last year
- Small audio language model for reasoning ☆86 · Dec 4, 2025 · Updated 2 months ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification" ☆80 · Nov 7, 2025 · Updated 3 months ago
- ☆36 · Mar 14, 2025 · Updated 11 months ago