MegEngine / End-to-end-ASR-Transformer
An end-to-end ASR Transformer model training repo
☆13 Updated 3 years ago
Alternatives and similar repositories for End-to-end-ASR-Transformer
Users who are interested in End-to-end-ASR-Transformer are comparing it to the libraries listed below.
- This repository contains source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 Updated 2 years ago
- ASR project with pytorch-lightning ☆20 Updated 3 months ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 Updated 3 years ago
- Curriculum Vitae of Quan Wang ☆15 Updated last week
- SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification ☆30 Updated 2 years ago
- Python implementation of CTC beam search decoder + agnostic LM scorer ☆19 Updated 4 years ago
- one script for xls-r/xlsr/whisper fine-tuning ☆42 Updated last year
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 Updated 2 years ago
- Code and instructions for replicating the experiments in the paper: Unified Hypersphere Embedding for Speaker Recognition ☆31 Updated 5 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z… ☆31 Updated 3 years ago
- A PyTorch implementation of various vector quantization methods ☆30 Updated 3 years ago
- Online (real-time) decoder to be used with the DeepSpeech2 model ☆25 Updated 5 years ago
- A packaged convolutional voice activity detector for noisy environments. ☆14 Updated 6 years ago
- Example implementation of Monotonic Chunkwise Attention. ☆52 Updated 7 years ago
- Implementation of RNN transducer ☆22 Updated 6 years ago
- TF code for our CVPR2020 paper "Discriminative Multi-modality Speech Recognition" ☆26 Updated 3 years ago
- Anonymous ICLR Submission ☆14 Updated 5 years ago
- neural network based speaker embedder ☆25 Updated 2 years ago
- ☆25 Updated 2 years ago
- ☆11 Updated 3 years ago
- Dual cross modality attention audio-visual speech recognition model based on vgg transformer with hybrid CTC/attention architecture using… ☆12 Updated 4 years ago
- HMM, CTC, RNN-Transducer, forward-backward algorithm ☆21 Updated last year
- Implementation for NATv2.