lightning830/E2E-audio-speech-recognition

lightning830 / E2E-audio-speech-recognition

Conformer encoder + Transformer decoder with Hybrid CTC/attention

☆12

Alternatives and similar repositories for E2E-audio-speech-recognition

Users who are interested in E2E-audio-speech-recognition are comparing it to the libraries listed below.

eastonYi / end-to-end_asr_pytorch
Implementations of CTC, Speech-Transformer and CIF for end-to-end speech recognition with PyTorch
☆23 · Jul 28, 2020 · Updated 5 years ago
LeeYongHyeok / DCM_vgg_transformer
Dual cross modality attention audio-visual speech recognition model based on vgg transformer with hybrid CTC/attention architecture using…
☆14 · Jul 2, 2020 · Updated 6 years ago
zzpDapeng / Transformer-Transducer
A streamable speech recognition model with transformer encoders and RNN-T loss
☆11 · Mar 1, 2021 · Updated 5 years ago
sooftware / openspeech
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
☆35 · Oct 18, 2021 · Updated 4 years ago
diaoenmao / Speech-Emotion-Recognition-with-Dual-Sequence-LSTM-Architecture
[ICASSP 2020] Speech Emotion Recognition with Dual-Sequence LSTM Architecture
☆12 · Jan 17, 2025 · Updated last year
mayank-git-hub / ETE-Speech-Recognition
Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure Python and PyTorch
☆26 · Jul 25, 2024 · Updated 2 years ago
gauthamsuresh09 / wav2vec2-large-xlsr-53-malayalam
Wav2vec2 Large XLSR 53 fine-tuned for Malayalam
☆11 · Sep 7, 2021 · Updated 4 years ago
diego-fustes / asr-rescoring
Rescoring methods for end-to-end Automatic Speech Recognition
☆27 · Sep 23, 2020 · Updated 5 years ago
pjlintw / NNLM
Implementation of "A Neural Probabilistic Language Model" by Yoshua Bengio et al. in TensorFlow
☆11 · Feb 2, 2023 · Updated 3 years ago
MonsterFanSec / Digital-Image-Encryption-Algorithm-Based-on-DES
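A digital image encryption algorithm based on Triple DES. Using the DES cipher and block cipher modes of operation, it encrypts any input digital image and outputs the encrypted image. It can also restore an encrypted image from the encrypted image, the DES key, and related information, so that the decrypted image matches the original. CSDN link: https://blog.csd…
☆17 · Dec 11, 2024 · Updated last year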
tuanio / conformer-rnnt
Conformer RNN-Transducer
☆14 · May 25, 2022 · Updated 4 years ago
tuanio / nextformer
PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"
☆10 · Dec 15, 2022 · Updated 3 years ago
Bucknalla / lopy-raspberrypi
🎮 Use a Raspberry Pi to control a LoPy over UART
☆12 · Mar 9, 2017 · Updated 9 years ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
tongjinle123 / speech-transformer-pytorch_lightning
ASR project with pytorch-lightning
☆20 · Mar 21, 2025 · Updated last year
wuyinwuxian / Neural_Network_optimization_method
MATLAB code comparing five common neural network optimization algorithms: SGD, SGDM, Adagrad, AdaDelta, and Adam
☆11 · Mar 23, 2022 · Updated 4 years ago
jarret / raspi-uart-waveshare
A library for interfacing with the 4.3inch UART e-Paper from a Raspberry Pi 2/3 via Python3 with example programs to display QR Codes for…
☆12 · Mar 9, 2019 · Updated 7 years ago
TarekVito / ColorCoherenceVector
Color Coherence Vector is a powerful color-based image retrieval method (Matlab)
☆11 · Feb 27, 2015 · Updated 11 years ago
msalhab96 / MultiSpeech
PyTorch implementation of the paper "MultiSpeech: Multi-Speaker Text to Speech with Transformer"
☆21 · Jun 23, 2022 · Updated 4 years ago
lshiwjx / deformable-3d-convnets
Deformable 3D ConvNets for Action Recognition
☆10 · Jan 21, 2018 · Updated 8 years ago
ZhaoZeyu1995 / BenNevis
A Differentiable WFST-based End-to-End Automatic Speech Recognition toolkit with flexible topology support
☆12 · Feb 15, 2026 · Updated 5 months ago
saschaschramm / MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
☆10 · May 30, 2018 · Updated 8 years ago
ZQuang2202 / Zipformer_Lightning
An upgraded framework for training and validation, compared with icefall, using Lightning.
☆16 · Mar 26, 2025 · Updated last year
artem179 / WLAS
The implementation of the 'Watch, Listen, Attend and Spell' (WLAS) network that learns to transcribe videos of mouth motion to character on p…
☆11 · Mar 23, 2018 · Updated 8 years ago
TeaPoly / Conformer-Athena
Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena.
☆44 · Nov 2, 2022 · Updated 3 years ago
wq2012 / SpeakerRecognitionCourseChinese
☆17 · Oct 31, 2022 · Updated 3 years ago
Cydia2018 / AS-ViT
Adaptive Sparse ViT
☆16 · Aug 1, 2023 · Updated 2 years ago
bootphon / sustained-phonation-features
Python package for the extraction of speech features for sustained phonation
☆12 · Aug 10, 2020 · Updated 5 years ago
robflynnyh / long-context-asr
Code for the paper: How Much Context Does My Attention-Based ASR System Need?
☆11 · Jul 3, 2026 · Updated 3 weeks ago
georgid / AlignmentEvaluation
Scripts for computing common lyrics-to-audio alignment evaluation metrics. Usable evaluation for any token-based alignment (e.g. if tok…
☆18 · Oct 27, 2020 · Updated 5 years ago
kchan7 / WER-CER
Calculator tool for Word Error Rate and Character Error Rate
☆14 · Nov 3, 2020 · Updated 5 years ago
hchung12 / espnet-asr
☆37 · Dec 23, 2020 · Updated 5 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 2 years ago
tsiangleo / TensorFlowMnist
☆15 · Apr 27, 2017 · Updated 9 years ago
elianap / divexplorer
☆11 · May 5, 2022 · Updated 4 years ago
anushka23g / Parkinson-Disease-Classification
A Machine Learning Approach for the Diagnosis of Parkinson's Disease via Speech Analysis
☆21 · Dec 27, 2020 · Updated 5 years ago
youngbin-ro / audiotext-transformer
Multimodal Transformer for Korean Sentiment Analysis with Audio and Text Features
☆28 · Sep 7, 2021 · Updated 4 years ago
gentaiscool / end2end-asr-pytorch
End-to-End Automatic Speech Recognition on PyTorch
☆304 · Jun 2, 2022 · Updated 4 years ago
SSTGroup / independent_vector_analysis
Independent Vector Analysis (IVA-G and IVA-L-SOS) implemented in Python
☆21 · Nov 24, 2025 · Updated 8 months ago