PyTorch implementation of Shortcut Models [Frans, 2025] with minor modifications
☆71Jul 11, 2025Updated 7 months ago
Alternatives and similar repositories for modified-shortcut-models-pytorch
Users who are interested in modified-shortcut-models-pytorch are comparing it to the libraries listed below
- ESLTTS dataset☆16Feb 6, 2025Updated last year
- PyTorch implementation for "Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes" (ICML 2024).☆13Jul 21, 2024Updated last year
- Code for PolyTask: Learning Unified Policies through Behavior Distillation☆12Oct 13, 2023Updated 2 years ago
- ☆22Dec 19, 2025Updated 2 months ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024☆16Nov 19, 2024Updated last year
- Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models☆15Sep 10, 2025Updated 5 months ago
- [NeurIPS 2024] Official code for "Variational Distillation of Diffusion Policies into Mixture of Experts"☆17Dec 7, 2024Updated last year
- Prioritized Generative Replay (ICLR 2025 Oral)☆25Mar 1, 2025Updated last year
- A collection of real-time audio effect algorithms implemented in C++.☆19Jul 16, 2025Updated 7 months ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆20Jan 3, 2023Updated 3 years ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …☆22Feb 7, 2026Updated last month
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens…☆41Feb 17, 2026Updated 3 weeks ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆40Jul 10, 2023Updated 2 years ago
- [CoRL 2025] Pretraining code for FLOWER VLA on OXE☆32Sep 22, 2025Updated 5 months ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…☆20Oct 11, 2024Updated last year
- llama2 in Julia☆14Jul 24, 2023Updated 2 years ago
- An AR+AR TTS attempt.☆18Jan 13, 2025Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆61Feb 21, 2022Updated 4 years ago
- Portfolio REgret for Confidence SEquences☆21Jan 6, 2026Updated 2 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"☆35Oct 23, 2025Updated 4 months ago
- ☆19Mar 22, 2024Updated last year
- Voice conversion with just linear regression.☆35Sep 25, 2025Updated 5 months ago
- Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).☆79Jun 8, 2025Updated 9 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o☆22Jun 1, 2024Updated last year
- ☆22Dec 15, 2023Updated 2 years ago
- ☆11Jun 11, 2025Updated 8 months ago
- Official codebase for Human Guided Exploration (HuGE)☆22Aug 16, 2023Updated 2 years ago
- A pitch detection model trained to be robust against noise and reverberation environments.☆27Jan 21, 2025Updated last year
- ☆54Jul 16, 2025Updated 7 months ago
- Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"☆41Jun 28, 2025Updated 8 months ago
- [CoRL 2024] Official code for "Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models"☆28Dec 11, 2024Updated last year
- 2024 Latest laughter detection & segmentation model. Paper: "Robust Laughter Segmentation with Automatic Diverse Data Synthesis", Interspe…☆62Sep 1, 2024Updated last year
- Official code for AAAI 2026 paper (One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow)☆23Dec 15, 2025Updated 2 months ago
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech.☆31Apr 13, 2023Updated 2 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python.☆25Aug 31, 2025Updated 6 months ago
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion.☆32Apr 10, 2023Updated 2 years ago
- StyleTTS 2 Optimized Training Fork☆33Feb 2, 2025Updated last year
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023☆27Apr 27, 2023Updated 2 years ago