mailong25/self-supervised-speech-recognition

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/mailong25/self-supervised-speech-recognition)

mailong25 / self-supervised-speech-recognition

speech to text with self-supervised learning based on wav2vec 2.0 framework

☆380

Alternatives and similar repositories for self-supervised-speech-recognition

Users that are interested in self-supervised-speech-recognition are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

eastonYi / wav2vec
View on GitHub
a simplified version of wav2vec(1.0, vq, 2.0) in fairseq
☆170Sep 21, 2020Updated 5 years ago
dangvansam / viet-asr
View on GitHub
VietASR - Vietnamese Automatic Speech Recognition
☆171Jun 18, 2026Updated last month
hirofumi0810 / neural_sp
View on GitHub
End-to-end ASR/LM implementation with PyTorch
☆594Aug 30, 2021Updated 4 years ago
kensho-technologies / pyctcdecode
View on GitHub
A fast and lightweight python-based CTC beam search decoder for speech recognition.
☆469Jul 13, 2023Updated 3 years ago
patrickvonplaten / Wav2Vec2_PyCTCDecode
View on GitHub
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode
☆110Aug 31, 2022Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
facebookresearch / WavAugment
View on GitHub
A library for speech data augmentation in time-domain
☆689Aug 30, 2021Updated 4 years ago
Edresson / Wav2Vec-Wrapper
View on GitHub
An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.
☆80May 20, 2023Updated 3 years ago
lumaku / ctc-segmentation
View on GitHub
Segment an audio file and obtain utterance alignments. (Python package)
☆348May 15, 2024Updated 2 years ago
s3prl / s3prl
View on GitHub
Self-Supervised Speech Pre-training and Representation Learning Toolkit
☆2,558Mar 12, 2026Updated 4 months ago
asappresearch / wav2seq
View on GitHub
Official code for Wav2Seq
☆97Jul 19, 2022Updated 4 years ago
m3hrdadfi / soxan
View on GitHub
Wav2Vec for speech recognition, classification, and audio classification
☆276Apr 2, 2022Updated 4 years ago
ZhengkunTian / OpenTransformer
View on GitHub
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
☆378Jul 21, 2022Updated 4 years ago
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
techiaith / docker-huggingface-stt-cy
View on GitHub
Adnabod lleferydd Cymraeg i'r Gymraeg gyda HuggingFace // Speech Recognition for Welsh with HuggingFace
☆13Nov 29, 2022Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
openspeech-team / openspeech
View on GitHub
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
☆716Jun 21, 2026Updated last month
burchim / EfficientConformer
View on GitHub
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
☆221Jun 22, 2023Updated 3 years ago
khanld / ASR-Wav2vec-Finetune
View on GitHub
Finetune Wa2vec 2.0 For Speech Recognition
☆149Feb 6, 2025Updated last year
thu-spmi / CAT
View on GitHub
CAT is more than a CRF-based ASR toolkit: it provides a complete workflow for data-efficient end-to-end ASR, supporting CTC, CTC-CRF, RNN…
☆368Feb 5, 2026Updated 5 months ago
cywang97 / StreamingTransformer
View on GitHub
☆277Jan 15, 2021Updated 5 years ago
qinyuenlp / wav2vec_finetune
View on GitHub
ASR: fine-tune wav2vec 2.0 with transformers
☆21Sep 13, 2021Updated 4 years ago
facebookresearch / CPC_audio
View on GitHub
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆374Oct 12, 2021Updated 4 years ago
gentaiscool / end2end-asr-pytorch
View on GitHub
End-to-End Automatic Speech Recognition on PyTorch
☆304Jun 2, 2022Updated 4 years ago
vietai / ASR
View on GitHub
End-to-End Vietnamese Speech Recognition using wav2vec 2.0
☆106Sep 3, 2021Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
farisalasmary / wav2vec2-kenlm
View on GitHub
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74Oct 11, 2021Updated 4 years ago
ccoreilly / wav2vec2-service
View on GitHub
☆41Jan 14, 2022Updated 4 years ago
wenet-e2e / speech-recognition-papers
View on GitHub
Towards hot directions in industrial end to end speech recognition
☆329Nov 30, 2021Updated 4 years ago
facebookresearch / voxpopuli
View on GitHub
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
☆574Apr 2, 2023Updated 3 years ago
YoavRamon / awesome-kaldi
View on GitHub
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
☆536Feb 9, 2022Updated 4 years ago
facebookresearch / libri-light
View on GitHub
dataset for lightly supervised training using the librivox audio book recordings. https://librivox.org/.
☆528Jul 11, 2023Updated 3 years ago
yanghaha0908 / FastHuBERT
View on GitHub
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
☆100Nov 20, 2024Updated last year
oliverguhr / wav2vec2-live
View on GitHub
A live speech recognition using Facebooks wav2vec 2.0 model.
☆378Feb 4, 2024Updated 2 years ago
jonatasgrosman / huggingsound
View on GitHub
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
☆470Sep 20, 2023Updated 2 years ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
kehanlu / Mandarin-Wav2Vec2
View on GitHub
Pre-trained Wav2vec2.0 for Mandarin
☆43Oct 30, 2022Updated 3 years ago
SpeechColab / GigaSpeech
View on GitHub
Large, modern dataset for speech recognition
☆731Feb 26, 2024Updated 2 years ago
sooftware / speech-transformer
View on GitHub
Transformer implementation speciaized in speech recognition tasks using Pytorch.
☆65Nov 28, 2021Updated 4 years ago
Alexander-H-Liu / End-to-end-ASR-Pytorch
View on GitHub
This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with Pyt…
☆1,210Dec 19, 2020Updated 5 years ago
lhotse-speech / lhotse
View on GitHub
Tools for handling multimodal data in machine learning projects.
☆1,143Jun 22, 2026Updated last month
iver56 / torch-audiomentations
View on GitHub
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
☆1,161Nov 24, 2025Updated 8 months ago
JRMeyer / easy-kaldi
View on GitHub
Use your data to create a speech recognition system in Kaldi. Fast.
☆65Jan 2, 2020Updated 6 years ago