rhasspy / espeak-ng
eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.
☆16Updated 9 months ago
Related projects:
- C++ library for converting text to phonemes for Piper☆78Updated 6 months ago
- ☆18Updated 2 years ago
- A fast MP3 decoder for python, using minimp3☆26Updated 2 years ago
- a cpp ggml port of "VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech." for use in mobile…☆29Updated 3 weeks ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo☆24Updated last year
- Interface for using TTS and vocoder models in the form of a text editor☆19Updated last year
- TTS Client for Coqui TTS server☆13Updated last year
- A sample Android app using [whisper.cpp](https://github.com/ggerganov/whisper.cpp/) to do voice-to-text transcriptions.☆62Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference☆50Updated 5 months ago
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible☆13Updated 9 months ago
- Port of Meta's Encodec in C/C++☆187Updated last month
- A ggml (C++) re-implementation of tortoise-tts☆147Updated last month
- Streamlit app to visualize and edit TTS datasets☆14Updated 2 years ago
- an improved version of Real-time-voice-cloning☆45Updated 6 months ago
- Open models for Coqui STT☆119Updated last year
- DeepFloyd IF web UI☆27Updated last year
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning …☆22Updated last year
- On-device streaming text-to-speech engine powered by deep learning☆43Updated last week
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆34Updated last week
- Text-to-Music Generation with Rectified Flow Transformer☆38Updated 2 weeks ago
- Lyra V2 (SoundStream) running in the browser☆17Updated last year
- C++ version of openWakeWord☆16Updated 2 months ago
- Versatile AI-driven audio upscaler to enhance the quality of any audio.☆38Updated last week
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆27Updated last year
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.☆44Updated 10 months ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s…☆27Updated last year
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a…☆15Updated this week
- ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation☆41Updated last year
- ☆12Updated this week
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the…☆46Updated last year