xiabingquan/Automatic-Speech-Recognition-from-Scratch

xiabingquan / Automatic-Speech-Recognition-from-Scratch

A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on Transformer

☆85

Alternatives and similar repositories for Automatic-Speech-Recognition-from-Scratch

Users that are interested in Automatic-Speech-Recognition-from-Scratch are comparing it to the libraries listed below.

jiwidi / DeepSpeech-pytorch
PyTorch implementation for DeepSpeech 2.0
☆31 · Jul 25, 2024 · Updated last year
SpringHuo / MAVD
The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…
☆20 · Apr 22, 2024 · Updated 2 years ago
ms-dot-k / AVSR
PyTorch implementation of "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scorin…
☆23 · Apr 3, 2024 · Updated 2 years ago
nguyenvulebinh / AVSRCocktail
Audio-Visual Speech Recognition
☆26 · Jul 7, 2025 · Updated last year
s920128 / NAR-BERT-ASR
NAR-BERT-ASR
☆10 · Sep 27, 2021 · Updated 4 years ago
SpeechEE / SpeechEE
☆11 · Aug 20, 2025 · Updated 11 months ago
xuyouqian / Bert-Ner-Demo
Nested named entity recognition (Nested NER)
☆19 · Nov 14, 2021 · Updated 4 years ago
Sihang-Geng / AIC_Solution
National First Prize AICOMP book-borrowing recommendation codebase with two-stage modeling, multi-source candidate generation, stable sig…
☆17 · Apr 30, 2026 · Updated 2 months ago
WThirteen / asr_AISHELL-3
Chinese speech recognition (trains a speech recognition model on the AISHELL-3 dataset)
☆11 · Oct 17, 2024 · Updated last year
DengBoCong / hlp
Deep-learning-based dialogue systems, speech recognition, machine translation, speech synthesis, and more.
☆13 · Jan 3, 2021 · Updated 5 years ago
jreremy / conformer
PyTorch implementation of conformer with training script for end-to-end speech recognition on the LibriSpeech dataset.
☆29 · May 1, 2024 · Updated 2 years ago
andi611 / Mockingjay-Speech-Representation
Official Implementation of Mockingjay in PyTorch
☆55 · Jul 6, 2023 · Updated 3 years ago
eastonYi / end-to-end_asr_pytorch
Implementations of CTC, Speech-Transformer and CIF for end-to-end speech recognition with PyTorch
☆23 · Jul 28, 2020 · Updated 5 years ago
yhsong06 / LAU-Net
☆16 · May 23, 2025 · Updated last year
yeyupiaoling / MASR
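A PyTorch automatic speech recognition framework with streaming and non-streaming modes, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and multiple data augmentation methods.
☆727 · Jul 6, 2026 · Updated 2 weeks ago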
DMU-ITREC / itrec-nlp-newcomer-guide
☆16 · Dec 3, 2025 · Updated 7 months ago
JacobLinCool / MPSENet
Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
☆22 · Nov 1, 2024 · Updated last year
LuZer0417 / pcycho_project
A simple Weibo keyword crawler plus sentiment analysis based on the GPT 3.5 model
☆16 · Sep 7, 2023 · Updated 2 years ago
glynpu / asr_abc
Chinese automatic speech recognition (ASR)
☆14 · Dec 30, 2021 · Updated 4 years ago
MenglingD / mandarin_speech_recognition
Deep-learning-based Mandarin speech recognition
☆18 · Apr 23, 2019 · Updated 7 years ago
my-yy / sl_icmr2022
Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning", ICMR 2022
☆15 · Oct 25, 2024 · Updated last year
ZenMule / Praat_Scripting_Tutorial
Introduction to Praat scripting
☆15 · Apr 8, 2025 · Updated last year
pabdzadeh / voice-spoof-detection-system
A voice spoofing detection system, based on a paper presented at ICSPIS 2021
☆10 · Feb 11, 2022 · Updated 4 years ago
LYMDLUT / zpdb
☆18 · May 28, 2024 · Updated 2 years ago
ymoslem / MT-Tools
Collection of Common Machine Translation Tools
☆11 · Jul 26, 2022 · Updated 3 years ago
DataXujing / ASR-paper
ASR tutorial: https://dataxujing.github.io/ASR-paper/
☆26 · Jul 1, 2024 · Updated 2 years ago
upskyy / Transformer-Transducer
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS…
☆114 · Feb 27, 2022 · Updated 4 years ago
BAI-Yeqi / SF2F_PyTorch
☆16 · Apr 27, 2025 · Updated last year
msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆13 · Aug 28, 2023 · Updated 2 years ago
yyyanbj / experiment-for-pl0-compiler-expansion
🚀 Hainan University Compiler Principles course: extending a pl0 language compiler
☆11 · Dec 19, 2020 · Updated 5 years ago
lzw-lzw / UnifiedMLLM
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model
☆22 · Aug 5, 2024 · Updated last year
cogmhear / Intelligibility-Oriented-Audio-Visual-Speech-Enhancement
Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
☆15 · Sep 6, 2024 · Updated last year
zcai0612 / InstantBooth
My implementation of InstantBooth
☆14 · Sep 11, 2023 · Updated 2 years ago
tzhengus / ManchuDict
A simple dictionary in Manchu, Chinese and English.
☆14 · Feb 27, 2015 · Updated 11 years ago
ZhengkunTian / Speech-Tranformer-Pytorch
Seq2Seq Speech Recognition with Transformer on Mandarin Chinese
☆117 · Dec 20, 2019 · Updated 6 years ago
lumaku / ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
☆348 · May 15, 2024 · Updated 2 years ago
choijeongsoo / lip2speech-unit
[Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units
☆47 · Oct 26, 2024 · Updated last year
bbartoldson / Adversarial-Robustness-Limits
ICML 2024 Paper "Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies"
☆18 · Jul 10, 2024 · Updated 2 years ago
dzy1011 / Uni-ToD
Open source code for AAAI 2024 Paper "From Retrieval to Generation: A Simple and Unified Generative Model for End-to-End Task-Oriented Di…
☆24 · Apr 8, 2024 · Updated 2 years ago