TigreGotico / chatterbox-onnx
chatterbox TTS + Voice Clone using onnx
☆27 Updated 3 weeks ago
Alternatives and similar repositories for chatterbox-onnx
Users who are interested in chatterbox-onnx are comparing it to the libraries listed below.
- StyleTTS2 + Vocos as a Decoder ☆13 Updated 10 months ago
- Soprano-Factory: Train your own 2000x realtime text-to-speech model ☆156 Updated 2 weeks ago
- ☆54 Updated last week
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆13 Updated 9 months ago
- Fast audio super resolution from 16 kHz to 48 kHz. ☆188 Updated 3 weeks ago
- Open TTS models, built for streaming on the edge ☆44 Updated 10 months ago
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆20 Updated 8 months ago
- High quality text-to-speech based on StyleTTS 2. ☆71 Updated last month
- StyleTTS 2 Optimized Training Fork ☆33 Updated 11 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆133 Updated 5 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 Updated last year
- Echo-TTS inference codebase ☆84 Updated last month
- Streaming and Fine-tuning for Chatterbox TTS ☆262 Updated 7 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 8 months ago
- ☆20 Updated 10 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆57 Updated 8 months ago
- A highly optimized engine for neutts-air model to generate minutes of audio in seconds. Over 200x realtime on modern hardware! ☆105 Updated 2 months ago
- Very fast, accurate speaker diarization ☆222 Updated 3 weeks ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆181 Updated 3 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆14 Updated 2 months ago
- A lightweight Python library for running TTS models with a unified API. ☆21 Updated 11 months ago
- Audio tokenization, in the fastest way possible! ☆53 Updated last year
- A highly compressive and high-quality neural audio codec for speech models. ☆227 Updated this week
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ… ☆28 Updated 4 months ago
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit… ☆22 Updated last year
- Create Unmute voice embeddings ☆23 Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆70 Updated 3 months ago
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec. ☆148 Updated 3 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 Updated last year
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆48 Updated 4 months ago