Conformer encoder + Transformer decoder with Hybrid CTC/attention
☆12Nov 11, 2021Updated 4 years ago
Alternatives and similar repositories for E2E-audio-speech-recognition
Users that are interested in E2E-audio-speech-recognition are comparing it to the libraries listed below.
- Implementations of CTC, Speech-Transformer and CIF for end-to-end speech recognition with PyTorch☆23Jul 28, 2020Updated 5 years ago
- Dual cross modality attention audio-visual speech recognition model based on vgg transformer with hybrid CTC/attention architecture using…☆14Jul 2, 2020Updated 5 years ago
- Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.☆35Oct 18, 2021Updated 4 years ago
- Ecr-helper is a tool for call recording☆33Apr 29, 2026Updated last month
- [ICASSP 2020] Speech Emotion Recognition with Dual-Sequence LSTM Architecture☆12Jan 17, 2025Updated last year
- Wav2vec2 Large XLSR 53 fine-tuned for Malayalam☆11Sep 7, 2021Updated 4 years ago
- Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure python and PyTorch☆26Jul 25, 2024Updated last year
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Sep 23, 2020Updated 5 years ago
- Conformer RNN-Transducer☆14May 25, 2022Updated 4 years ago
- A Differentiable WFST-based End-to-End Automatic Speech Recognition toolkit with flexible topology support☆12Feb 15, 2026Updated 4 months ago
- A library for interfacing with the 4.3inch UART e-Paper from a Raspberry Pi 2/3 via Python3 with example programs to display QR Codes for…☆12Mar 9, 2019Updated 7 years ago
- Deformable 3D ConvNets for Action Recognition☆10Jan 21, 2018Updated 8 years ago
- This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.☆10May 30, 2018Updated 8 years ago
- Python package for the extraction of speech features for sustained phonation☆12Aug 10, 2020Updated 5 years ago
- ☆18Oct 31, 2022Updated 3 years ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need?☆11Updated this week
- The implementation of the 'Watch, Listen, Attend and Spell' (WLAS) network that learns to transcribe videos of mouth motion to character on p…☆11Mar 23, 2018Updated 8 years ago
- ☆37Dec 23, 2020Updated 5 years ago
- ☆11May 5, 2022Updated 4 years ago
- MATLAB code comparing five common neural network optimization algorithms: SGD, SGDM, Adagrad, AdaDelta, and Adam☆11Mar 23, 2022Updated 4 years ago
- A Machine Learning Approach for the Diagnosis of Parkinson's Disease via Speech Analysis☆21Dec 27, 2020Updated 5 years ago
- End-to-End Automatic Speech Recognition on PyTorch☆304Jun 2, 2022Updated 4 years ago
- Scripts for computing common lyrics-to-audio alignment evaluation metrics. Usable evaluation for any token-based alignment (e.g. if tok…☆18Oct 27, 2020Updated 5 years ago
- Speech Recognition for Uyghur using Speech transformer☆28Jun 19, 2021Updated 4 years ago
- An upgraded framework for training and validation, compared with icefall, using Lightning.☆16Mar 26, 2025Updated last year
- 2022 DCASE Challenge☆14Sep 30, 2024Updated last year
- Official code implementation of "MAD: A Military Audio Dataset for Situational Awareness and Surveillance"☆15Nov 26, 2025Updated 6 months ago
- Implementation of True Online TD(lambda) with a Fourier Basis function approximator.☆13May 9, 2015Updated 11 years ago
- Pytorch implementation of HTR on IAM dataset (word or line level + CTC loss)☆21Jul 28, 2022Updated 3 years ago
- RNN-Transducer for Korean☆45Oct 31, 2020Updated 5 years ago
- Archives for Triton Inference Server Practices☆15Feb 28, 2022Updated 4 years ago
- audio/speech feature extraction using parselmouth, librosa, disvoice☆10Jan 28, 2022Updated 4 years ago
- ☆15Oct 15, 2020Updated 5 years ago
- Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…☆15Jul 27, 2023Updated 2 years ago
- Development kit for Pandora☆14Aug 4, 2020Updated 5 years ago
- ☆16Nov 9, 2023Updated 2 years ago
- useful things that work with NVIDIA NeMo library☆14Jan 20, 2024Updated 2 years ago
- Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes☆20May 30, 2023Updated 3 years ago
- PyTorch implementation of Vanilla PG, TNPG, TRPO, PPO on Mujoco environment☆12Feb 22, 2019Updated 7 years ago