lucasnewman / nanospeech
A simple, hackable text-to-speech system in PyTorch and MLX
☆146 · Updated last month
Alternatives and similar repositories for nanospeech:
Users who are interested in nanospeech are comparing it to the repositories listed below.
- ☆104 · Updated this week
- ☆62 · Updated 8 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆186 · Updated last week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 · Updated 5 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 · Updated 10 months ago
- Video+code lecture on building nanoGPT from scratch ☆66 · Updated 9 months ago
- Collection of Open Source Speech Data ☆152 · Updated 4 months ago
- python bindings for symphonia/opus - read various audio formats from python and write opus files ☆54 · Updated last week
- Open TTS models, built for streaming on the edge ☆39 · Updated 2 weeks ago
- Joint speech-language model - respond directly to audio! ☆30 · Updated 10 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆157 · Updated last week
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆61 · Updated 3 weeks ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆80 · Updated last month
- Official implementation of the TTS model Lina-Speech ☆157 · Updated 2 months ago
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆91 · Updated this week
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆20 · Updated 5 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 · Updated 3 weeks ago
- ☆84 · Updated last year
- Focused on fast experimentation and simplicity ☆70 · Updated 3 months ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆238 · Updated 3 weeks ago
- Speaker Diarization with Transformers ☆64 · Updated 10 months ago
- ☆53 · Updated 2 months ago
- ☆84 · Updated this week
- ☆280 · Updated 9 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆224 · Updated 2 weeks ago
- Audio tokenization, in the fastest way possible! ☆49 · Updated 7 months ago
- Audiogen Codec ☆131 · Updated 8 months ago
- An unofficial PyTorch implementation of VALL-E ☆87 · Updated this week
- research impl of Native Sparse Attention (2502.11089) ☆53 · Updated last month
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 · Updated 5 months ago