EndlessReform / smoltts
Open TTS models, built for streaming on the edge
☆44 · Updated 10 months ago
Alternatives and similar repositories for smoltts
Users who are interested in smoltts compare it to the libraries listed below.
- Audio tokenization, in the fastest way possible! ☆53 · Updated last year
- Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆132 · Updated 5 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆69 · Updated 2 months ago
- StyleTTS 2 Optimized Training Fork ☆33 · Updated 11 months ago
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec. ☆143 · Updated 3 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 · Updated 8 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆104 · Updated last year
- ☆62 · Updated last year
- Official implementation of the TTS model Lina-Speech ☆175 · Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆20 · Updated last year
- Collection of Open Source Speech Data ☆164 · Updated 3 months ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆146 · Updated 8 months ago
- An unofficial PyTorch implementation of VALL-E ☆88 · Updated 5 months ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆52 · Updated 4 years ago
- VoiceBox neural network implementation ☆110 · Updated last year
- Joint speech-language model - respond directly to audio! ☆30 · Updated last year
- A collection of all our phonemizers for dataset construction and inference ☆27 · Updated 10 months ago
- Trying to build an all-in-one speech-text language model - a bit like GPT-4o ☆22 · Updated last year
- A TTS model that makes a speaker speak new languages ☆76 · Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library ☆20 · Updated 7 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated last year
- Open-source reproducible benchmarks from Argmax ☆77 · Updated this week
- Putting flows on top of neural transducers for better TTS ☆64 · Updated last month
- ☆61 · Updated 2 years ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆181 · Updated 2 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆14 · Updated 2 months ago
- ☆19 · Updated 10 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Updated last year
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆47 · Updated 4 months ago
- Collection of scripts from mHuBERT-147. ☆32 · Updated last year