DLAS - A configuration-driven trainer for generative models
☆142 · Oct 11, 2022 · Updated 3 years ago
Alternatives and similar repositories for DL-Art-School
Users who are interested in DL-Art-School are comparing it to the libraries listed below.
- Performant and accurate speech recognition built on Pytorch ☆254 · May 19, 2022 · Updated 3 years ago
- TorToiSe fine-tuning with DLAS ☆225 · Aug 1, 2024 · Updated last year
- Community framework for training tortoise ☆42 · Oct 29, 2022 · Updated 3 years ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ☆176 · Dec 18, 2023 · Updated 2 years ago
- Train the next generation of TTS systems. ☆170 · Sep 13, 2024 · Updated last year
- Fast TorToiSe inference (5x or your money back!) ☆828 · Jul 10, 2024 · Updated last year
- PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling ☆191 · Nov 18, 2021 · Updated 4 years ago
- Pytorch implementation of BigVSAN ☆202 · Dec 9, 2025 · Updated 4 months ago
- FlowMirror-HydraVox: A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens… ☆50 · Feb 17, 2026 · Updated 2 months ago
- Community-controlled voice data collection for language preservation and AI development. Companion to 'AI Techniques for Indigenous Cultu… ☆71 · Updated this week
- PPG-Based Voice Conversion ☆349 · Jul 22, 2022 · Updated 3 years ago
- A multi-voice TTS system trained with an emphasis on quality ☆14,843 · Nov 19, 2024 · Updated last year
- ☆260 · May 15, 2023 · Updated 2 years ago
- Create training data for training a voice cloner for bark text to speech. ☆48 · Jun 13, 2023 · Updated 2 years ago
- A fast MP3 decoder for python, using minimp3 ☆30 · Sep 20, 2022 · Updated 3 years ago
- Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889) ☆284 · Oct 8, 2021 · Updated 4 years ago
- A simple script to prepare dataset for training with TTS Tortoise model via https://git.ecker.tech/mrq/ai-voice-cloning ☆13 · Jan 12, 2024 · Updated 2 years ago
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ☆1,766 · Jan 26, 2026 · Updated 2 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ☆1,111 · Aug 7, 2024 · Updated last year
- ☆394 · Sep 3, 2024 · Updated last year
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆160 · Jun 13, 2024 · Updated last year
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆134 · Sep 25, 2023 · Updated 2 years ago
- Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future. ☆157 · Jul 2, 2021 · Updated 4 years ago
- GPT-style network for phonemization with durations of text ☆68 · Mar 21, 2024 · Updated 2 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network ☆45 · Dec 1, 2021 · Updated 4 years ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆297 · Oct 12, 2025 · Updated 6 months ago
- The Open Source Code of UniAudio ☆604 · Jul 22, 2024 · Updated last year
- Official PyTorch implementation of BigVGAN (ICLR 2023) ☆1,205 · Sep 5, 2024 · Updated last year
- [WIP] VoiceSmith makes training text to speech models easy. ☆231 · Oct 10, 2022 · Updated 3 years ago
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction. ☆89 · May 27, 2021 · Updated 4 years ago
- High-level API for tar-based dataset ☆12 · Feb 3, 2024 · Updated 2 years ago
- Learning Lip Sync of Obama from Speech Audio ☆67 · Jul 29, 2020 · Updated 5 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆141 · Apr 27, 2024 · Updated last year
- Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch ☆1,334 · Sep 24, 2023 · Updated 2 years ago
- Google's TPGST reimplementation. ☆34 · Dec 11, 2019 · Updated 6 years ago
- SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs ☆16 · Jul 19, 2023 · Updated 2 years ago
- Make-A-Video Latent Diffusion Model ☆19 · Nov 15, 2023 · Updated 2 years ago
- General Speech Restoration ☆1,313 · Feb 17, 2025 · Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ☆670 · Dec 27, 2023 · Updated 2 years ago