A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on Transformer
☆85Apr 29, 2024Updated 2 years ago
Alternatives and similar repositories for Automatic-Speech-Recognition-from-Scratch
Users that are interested in Automatic-Speech-Recognition-from-Scratch are comparing it to the libraries listed below.
- NAR-BERT-ASR☆10Sep 27, 2021Updated 4 years ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆18Nov 24, 2025Updated 5 months ago
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…☆20Apr 22, 2024Updated 2 years ago
- ☆10Aug 20, 2025Updated 8 months ago
- Chinese speech recognition (a speech recognition model trained on the AISHELL-3 dataset)☆11Oct 17, 2024Updated last year
- Nested named entity recognition (Nested NER)☆20Nov 14, 2021Updated 4 years ago
- AI application service platform☆52Nov 12, 2025Updated 5 months ago
- Official Implementation of Mockingjay in Pytorch☆56Jul 6, 2023Updated 2 years ago
- PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset.☆29May 1, 2024Updated 2 years ago
- Deep-learning-based dialogue systems, speech recognition, machine translation, speech synthesis, and more.☆13Jan 3, 2021Updated 5 years ago
- a PyTorch implementation of Lip2Wav☆50Oct 2, 2022Updated 3 years ago
- Awesome Automatic Speech Recognition (ASR) paper collection☆22Sep 4, 2020Updated 5 years ago
- ☆17Jun 21, 2024Updated last year
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆26Feb 11, 2026Updated 2 months ago
- A PyTorch framework for streaming and non-streaming automatic speech recognition, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and multiple data augmentation methods.☆722Dec 17, 2025Updated 4 months ago
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆38Oct 15, 2025Updated 6 months ago
- A voice spoofing detection system, based on paper presented at ICSPIS 2021☆10Feb 11, 2022Updated 4 years ago
- Implementation of an RNN transducer☆23Jun 25, 2019Updated 6 years ago
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021)☆10Oct 11, 2021Updated 4 years ago
- Collection of Common Machine Translation Tools☆11Jul 26, 2022Updated 3 years ago
- Deep-learning-based Mandarin speech recognition☆18Apr 23, 2019Updated 7 years ago
- PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS…☆114Feb 27, 2022Updated 4 years ago
- Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning,ICMR,2022"☆15Oct 25, 2024Updated last year
- ASR tutorial: https://dataxujing.github.io/ASR-paper/☆26Jul 1, 2024Updated last year
- ☆16Apr 27, 2025Updated last year
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages☆17Apr 14, 2026Updated 3 weeks ago
- Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".☆13Aug 28, 2023Updated 2 years ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units☆47Oct 26, 2024Updated last year
- A simple dictionary in Manchu, Chinese and English.☆14Feb 27, 2015Updated 11 years ago
- Seq2Seq Speech Recognition with Transformer on Mandarin Chinese☆117Dec 20, 2019Updated 6 years ago
- ☆15Jan 7, 2023Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆347May 15, 2024Updated last year
- Official Implementation of our Interspeech 2021 paper "An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure …☆18Feb 15, 2022Updated 4 years ago
- Introduction to Praat scripting☆15Apr 8, 2025Updated last year
- 🚀 Hainan University Compiler Principles course: an extended compiler for the pl0 language☆10Dec 19, 2020Updated 5 years ago
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- This repository contains materials for the paper: Towards generating ambisonics using audio-visual cue for virtual reality☆13Jul 2, 2019Updated 6 years ago
- Open source code for AAAI 2024 Paper "From Retrieval to Generation: A Simple and Unified Generative Model for End-to-End Task-Oriented Di…☆23Apr 8, 2024Updated 2 years ago
- ☆13Jan 13, 2022Updated 4 years ago