sanchit-gandhi / seq2seq-speech
Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.
☆34 · Updated last year
Alternatives and similar repositories for seq2seq-speech:
Users who are interested in seq2seq-speech are comparing it to the libraries listed below:
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 · Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated 11 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆24 · Updated 9 months ago
- asr2k ☆48 · Updated 7 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆90 · Updated 3 months ago
- Speech in Flax/JAX ☆15 · Updated 2 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ ☆36 · Updated 2 years ago
- ☆62 · Updated last month
- Collection of scripts from mHuBERT-147. ☆23 · Updated last month
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 · Updated 6 months ago
- A TTS model that makes a speaker speak new languages ☆75 · Updated 7 months ago
- ☆19 · Updated last year
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 · Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆47 · Updated 11 months ago
- Audio tokenization, in the fastest way possible! ☆46 · Updated 4 months ago
- A collection of utilities for handling IPA phones. ☆25 · Updated last year
- ☆56 · Updated 2 years ago
- ☆74 · Updated 3 years ago
- Dataset Release for Intent Classification from Speech ☆46 · Updated last year
- ☆30 · Updated 3 weeks ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 · Updated 2 years ago
- Official code for Wav2Seq ☆96 · Updated 2 years ago
- The demo page of UniAudio ☆34 · Updated 11 months ago
- ☆41 · Updated 2 years ago
- Experiments with generating opensource language model assistants ☆97 · Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- ☆84 · Updated 9 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆22 · Updated 5 months ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 · Updated 3 years ago