neonbjb / tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
☆13,510, updated last month
Alternatives and similar repositories for tortoise-tts:
Users who are interested in tortoise-tts are comparing it to the repositories listed below:
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production (☆36,915, updated 5 months ago)
- Text-Prompted Generative Audio Model (☆36,678, updated 4 months ago)
- Text-prompted Generative Audio Model - With the ability to clone voices (☆3,213, updated 7 months ago)
- An unofficial PyTorch implementation of the audio LM VALL-E (☆2,983, updated last year)
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models (☆5,265, updated 5 months ago)
- A fast, local neural text to speech system (☆7,406, updated 2 months ago)
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… (☆24,165, updated this week)
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) (☆13,382, updated this week)
- An Open Source text-to-speech system built by inverting Whisper. (☆4,080, updated last month)
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ (☆7,751, updated 11 months ago)
- A Gradio web UI for Large Language Models with support for multiple inference backends. (☆41,616, updated this week)
- Faster Whisper transcription with CTranslate2 (☆13,490, updated 2 weeks ago)
- Foundational Models for State-of-the-Art Speech and Text Translation (☆11,156, updated 2 months ago)
- The simplest way to run LLaMA on your local machine (☆13,099, updated 6 months ago)
- Fast TorToiSe inference (5x or your money back!) (☆799, updated 6 months ago)
- TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5,… (☆1,947, updated last month)
- JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. (☆4,499, updated 9 months ago)
- (☆7,720, updated 9 months ago)
- AudioLDM: Generate speech, sound effects, music and beyond, with text. (☆2,527, updated last month)
- Stable diffusion for real-time music generation (☆3,469, updated 5 months ago)
- High-Resolution Image Synthesis with Latent Diffusion Models (☆39,774, updated 3 months ago)
- Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading (☆9,349, updated 4 months ago)
- The official Python API for ElevenLabs Text to Speech. (☆2,332, updated 3 weeks ago)
- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech (☆7,042, updated last year)
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. (☆37,496, updated this week)
- Stable Diffusion web UI (☆7,888, updated 5 months ago)
- AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of adv… (☆1,320, updated last week)
- Locally run an Instruction-Tuned Chat-Style LLM (☆10,240, updated last year)
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch (☆2,479, updated this week)
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… (☆21,325, updated this week)