Mobile-Artificial-Intelligence / babylon.cpp
Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. Phonemization uses an ONNX Runtime port of the DeepPhonemizer model, and speech synthesis uses VITS models. Piper models are compatible once they have been run through a conversion script.
☆29Updated 5 months ago
Alternatives and similar repositories for babylon.cpp
Users who are interested in babylon.cpp are comparing it to the libraries listed below.
- ☆55Updated 3 weeks ago
- A ggml (C++) re-implementation of tortoise-tts☆193Updated last year
- a cpp ggml port of "VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech." for use in mobile…☆43Updated last year
- Using OpenVINO to speed up MeloTTS inference☆15Updated last year
- Open TTS models, built for streaming on the edge☆45Updated 10 months ago
- C++ library for converting text to phonemes for Piper☆138Updated 6 months ago
- StyleTTS 2 Optimized Training Fork☆33Updated last year
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP…☆40Updated 11 months ago
- Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative model…☆47Updated last week
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model.☆52Updated 8 months ago
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec.☆149Updated last week
- On-device streaming text-to-speech engine powered by deep learning☆127Updated 2 weeks ago
- Audio tokenization, in the fastest way possible!☆53Updated last year
- Experiments to test different speech recognition systems for SEPIA Framework☆63Updated 2 years ago
- IPA Phonemizer/Dephonemizer for 140 human languages☆53Updated 3 weeks ago
- TTS support with GGML☆218Updated 4 months ago
- ☆21Updated 9 months ago
- zero-shot realtime TTS system, fully offline, free and open source☆50Updated 9 months ago
- VoiceBox neural network implementation☆110Updated last year
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning☆161Updated last year
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper☆31Updated last year
- Official implementation of the TTS model Lina-Speech☆176Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆135Updated 5 months ago
- Soprano-Factory: Train your own 2000x realtime text-to-speech model☆203Updated 3 weeks ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model.☆48Updated 4 months ago
- Putting flows on top of neural transducers for better TTS☆65Updated 2 weeks ago
- Self-hosted, high-quality voice recognition for de-googled Android using Whisper. Like Siri or OK Google.☆68Updated 2 years ago
- A TTS model capable of generating ultra-realistic dialogue in one pass.☆219Updated 9 months ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder☆13Updated 10 months ago