DLAS - A configuration-driven trainer for generative models
☆142Oct 11, 2022Updated 3 years ago
Alternatives and similar repositories for DL-Art-School
Users that are interested in DL-Art-School are comparing it to the libraries listed below.
- Performant and accurate speech recognition built on Pytorch☆254May 19, 2022Updated 3 years ago
- TorToiSe fine-tuning with DLAS☆226Aug 1, 2024Updated last year
- Community framework for training tortoise☆43Oct 29, 2022Updated 3 years ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆176Dec 18, 2023Updated 2 years ago
- Train the next generation of TTS systems.☆171Sep 13, 2024Updated last year
- Fast TorToiSe inference (5x or your money back!)☆829Jul 10, 2024Updated last year
- PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling☆191Nov 18, 2021Updated 4 years ago
- Pytorch implementation of BigVSAN☆203Dec 9, 2025Updated 3 months ago
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens…☆50Feb 17, 2026Updated last month
- Tools to create your own voice dataset for TTS training☆71Oct 26, 2020Updated 5 years ago
- PPG-Based Voice Conversion☆348Jul 22, 2022Updated 3 years ago
- A multi-voice TTS system trained with an emphasis on quality☆14,827Nov 19, 2024Updated last year
- ☆259May 15, 2023Updated 2 years ago
- Create training data for training a voice cloner for bark text to speech.☆48Jun 13, 2023Updated 2 years ago
- Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)☆283Oct 8, 2021Updated 4 years ago
- A simple script to prepare dataset for training with TTS Tortoise model via https://git.ecker.tech/mrq/ai-voice-cloning☆12Jan 12, 2024Updated 2 years ago
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.☆1,745Jan 26, 2026Updated 2 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis☆1,093Aug 7, 2024Updated last year
- ☆392Sep 3, 2024Updated last year
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning☆159Jun 13, 2024Updated last year
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.☆134Sep 25, 2023Updated 2 years ago
- Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.☆157Jul 2, 2021Updated 4 years ago
- GPT-style network for phonemization with durations of text☆68Mar 21, 2024Updated 2 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network☆45Dec 1, 2021Updated 4 years ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model☆296Oct 12, 2025Updated 5 months ago
- The Open Source Code of UniAudio☆606Jul 22, 2024Updated last year
- [WIP] VoiceSmith makes training text to speech models easy.☆229Oct 10, 2022Updated 3 years ago
- Official PyTorch implementation of BigVGAN (ICLR 2023)☆1,196Sep 5, 2024Updated last year
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.☆89May 27, 2021Updated 4 years ago
- High-level API for tar-based dataset☆12Feb 3, 2024Updated 2 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆142Apr 27, 2024Updated last year
- Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch☆1,334Sep 24, 2023Updated 2 years ago
- Google's TPGST reimplementation.☆34Dec 11, 2019Updated 6 years ago
- SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs☆16Jul 19, 2023Updated 2 years ago
- Make-A-Video Latent Diffusion Model☆19Nov 15, 2023Updated 2 years ago
- General Speech Restoration☆1,307Feb 17, 2025Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research☆671Dec 27, 2023Updated 2 years ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆194Jul 12, 2024Updated last year
- This is an implementation for train hifigan part of XTTSv2 model using Coqui/TTS.☆87Nov 12, 2024Updated last year