neonbjb / ocotillo
Performant and accurate speech recognition built on PyTorch
☆254 · May 19, 2022 · Updated 3 years ago
Alternatives and similar repositories for ocotillo
Users who are interested in ocotillo are comparing it to the libraries listed below.
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models · ☆175 · Dec 18, 2023 · Updated 2 years ago
- DLAS - A configuration-driven trainer for generative models · ☆142 · Oct 11, 2022 · Updated 3 years ago
- A collection of utilities for handling IPA phones. · ☆26 · Sep 24, 2023 · Updated 2 years ago
- Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889) · ☆281 · Oct 8, 2021 · Updated 4 years ago
- Just another FastSpeech 2 but cleaner code :) · ☆29 · Jun 28, 2024 · Updated last year
- Unofficial implementation of the WaveNeXt vocoder · ☆57 · Aug 28, 2024 · Updated last year
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages · ☆174 · Jun 9, 2023 · Updated 2 years ago
- ☆19 · Mar 22, 2024 · Updated last year
- A multi-voice TTS system trained with an emphasis on quality · ☆14,809 · Nov 19, 2024 · Updated last year
- Trying to build an all in one speech-text language model - a bit like GPT-4o · ☆22 · Jun 1, 2024 · Updated last year
- A neural vocoder supporting Ring Attention, Conformer and NSF. · ☆24 · Aug 1, 2025 · Updated 6 months ago
- Grapheme to phoneme conversion with deep learning. · ☆420 · Dec 8, 2023 · Updated 2 years ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions · ☆267 · Jan 13, 2025 · Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" · ☆11 · Apr 10, 2025 · Updated 10 months ago
- ☆11 · Nov 5, 2021 · Updated 4 years ago
- [WIP] VoiceSmith makes training text to speech models easy. · ☆228 · Oct 10, 2022 · Updated 3 years ago
- ☆71 · Jul 13, 2023 · Updated 2 years ago
- Audio tokenization, in the fastest way possible! · ☆53 · Aug 26, 2024 · Updated last year
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR. · ☆15 · Mar 21, 2023 · Updated 2 years ago
- ☆15 · Nov 11, 2024 · Updated last year
- Steps to perform text-based speaker diarization with the Kaldi toolkit · ☆12 · Nov 2, 2018 · Updated 7 years ago
- ☆52 · Jun 24, 2025 · Updated 7 months ago
- Speech Resynthesis and Language Modeling · ☆27 · Jun 11, 2025 · Updated 8 months ago
- A high-resolution implementation of the HiFi-GAN vocoder for voice conversion. · ☆32 · Apr 10, 2023 · Updated 2 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT · ☆74 · Sep 26, 2022 · Updated 3 years ago
- ☆259 · May 15, 2023 · Updated 2 years ago
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… · ☆39 · Sep 18, 2024 · Updated last year
- ☆59 · Oct 22, 2025 · Updated 3 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. · ☆52 · Dec 6, 2022 · Updated 3 years ago
- A handy dataset of noises for ASR · ☆22 · May 29, 2019 · Updated 6 years ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing · ☆71 · Dec 2, 2022 · Updated 3 years ago
- Transcribing Speech with Multinomial Diffusion, training code and models. · ☆80 · Sep 27, 2023 · Updated 2 years ago
- ☆61 · Oct 28, 2024 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. · ☆48 · Sep 4, 2023 · Updated 2 years ago
- My vocoder experiments · ☆31 · Jul 26, 2025 · Updated 6 months ago
- Viterbi decoding in PyTorch · ☆40 · Sep 10, 2025 · Updated 5 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability · ☆108 · Jan 17, 2025 · Updated last year
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g… · ☆146 · Jun 6, 2022 · Updated 3 years ago
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513 · ☆64 · Feb 13, 2023 · Updated 3 years ago