bhattbhavesh91 / wav2vec2-huggingface-demo
Speech to Text with self-supervised learning based on the wav2vec 2.0 framework, using Hugging Face's Transformers library
☆30Updated 3 years ago
Related projects
Alternatives and complementary repositories for wav2vec2-huggingface-demo
- GSoC'2021 | TensorFlow implementation of Wav2Vec2☆89Updated 2 years ago
- This project performs speaker diarization for the Hindi language.☆45Updated 3 years ago
- Incorporating a KenLM language model with the Hugging Face implementation of the Wav2Vec2CTC model using beam search decoding☆71Updated 3 years ago
- The codebase for Data-driven general-purpose voice activity detection.☆93Updated last year
- An implementation of Speech Emotion Recognition, based on HuBERT model, training with PyTorch and HuggingFace framework, and fine-tuning …☆32Updated 2 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- Fine-tune Wav2vec 2.0 for speech recognition☆115Updated last year
- SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition☆70Updated 4 years ago
- Wav2Vec for speech recognition, classification, and audio classification☆249Updated 2 years ago
- Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053☆143Updated 2 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆64Updated 3 years ago
- Repository containing an experimentation platform for training and running inference on wav2vec2 models.☆85Updated 2 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 4 years ago
- Implement a GRU/LSTM model using Keras, and train it to classify the languages using MFCC features☆27Updated 3 months ago
- A simplified version of wav2vec (1.0, vq, 2.0) in fairseq☆132Updated 4 years ago
- An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://…☆11Updated 2 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'☆126Updated 2 years ago
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It reports state-of-the-art results on the Speech Commands dataset V1 and V2.☆100Updated last year
- Transformer for ASR systems (via TensorFlow 2.0)☆113Updated 5 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆47Updated last year
- Various speech datasets made available to the public☆99Updated last month
- WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models wi…☆89Updated 3 years ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model☆106Updated 3 years ago
- A speaker gender classifier. MFC feature engineering and a pre-trained ResNet-50. GradCAM interpretation.☆26Updated 3 years ago
- Voice Activity Detection (VAD) using deep learning.☆192Updated 5 years ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus☆165Updated 4 months ago
- PyTorch implementation of Noisy Student Training for Automatic Speech Recognition and Automatic Pronunciation Error Detection problems☆86Updated last year
- A merged version of multiple open-source German speech datasets.☆30Updated 6 months ago
- Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition☆144Updated 3 years ago