oliverguhr / wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
☆336 Updated 11 months ago
Alternatives and similar repositories for wav2vec2-live:
Users who are interested in wav2vec2-live are comparing it to the libraries listed below.
- Segment an audio file and obtain utterance alignments. (Python package) ☆325 Updated 8 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆144 Updated 8 months ago
- Tools for handling speech data in machine learning projects. ☆972 Updated 3 weeks ago
- Grapheme to phoneme conversion with deep learning. ☆367 Updated last year
- Large, modern dataset for speech recognition ☆656 Updated 10 months ago
- A fully working pytorch implementation of NaturalSpeech (Tan et al., 2022) ☆471 Updated 11 months ago
- 🐸 collection of TTS papers ☆660 Updated 6 months ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition. ☆434 Updated last year
- Wav2Vec for speech recognition, classification, and audio classification ☆253 Updated 2 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆348 Updated 10 months ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ☆293 Updated 3 years ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ☆935 Updated 2 months ago
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation ☆520 Updated last year
- Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit ☆799 Updated last week
- Speaker embedding (d-vector) trained with GE2E loss ☆273 Updated last year
- ESPnet Model Zoo ☆245 Updated last year
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ☆244 Updated last year
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen… ☆216 Updated 2 years ago
- Speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition ☆475 Updated 3 years ago
- Variational Bayes HMM over x-vectors diarization ☆260 Updated last year
- UniSpeech - Large Scale Self-Supervised Learning for Speech ☆446 Updated 9 months ago
- Open models for Coqui STT ☆127 Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆295 Updated 2 months ago
- Diarization scoring tools. ☆232 Updated last year
- FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion ☆616 Updated 9 months ago
- List of speech synthesis papers. ☆1,017 Updated last year
- NeMo text processing for ASR and TTS ☆297 Updated last week
- End-to-End Neural Diarization ☆386 Updated 3 years ago
- Official Implementation of StyleTTS ☆410 Updated this week
- This is the GitHub page for publicly available emotional speech data. ☆330 Updated 3 years ago