A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on Transformer
☆83 · Apr 29, 2024 · Updated last year
Alternatives and similar repositories for Automatic-Speech-Recognition-from-Scratch
Users who are interested in Automatic-Speech-Recognition-from-Scratch are comparing it to the libraries listed below.
- PyTorch implementation of DeepSpeech 2.0 ☆31 · Jul 25, 2024 · Updated last year
- NAR-BERT-ASR ☆10 · Sep 27, 2021 · Updated 4 years ago
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima… ☆20 · Apr 22, 2024 · Updated last year
- AI application service platform ☆35 · Nov 12, 2025 · Updated 4 months ago
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023) ☆22 · Apr 27, 2024 · Updated last year
- ☆11 · Aug 20, 2025 · Updated 7 months ago
- Chinese speech recognition (a speech recognition model trained on the AISHELL-3 dataset) ☆11 · Oct 17, 2024 · Updated last year
- This project is derived from zipEnhancer, Alibaba's open-source speech denoising model ☆30 · Mar 4, 2025 · Updated last year
- PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset. ☆28 · May 1, 2024 · Updated last year
- Implementations of CTC, Speech-Transformer, and CIF for end-to-end speech recognition in PyTorch ☆23 · Jul 28, 2020 · Updated 5 years ago
- Awesome Automatic Speech Recognition (ASR) paper collection ☆22 · Sep 4, 2020 · Updated 5 years ago
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations ☆34 · Oct 15, 2025 · Updated 5 months ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis ☆24 · Feb 11, 2026 · Updated last month
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels ☆42 · Mar 20, 2024 · Updated 2 years ago
- Automatic health check-in program for the 我在校园 ("I'm on Campus") app ☆14 · Aug 1, 2022 · Updated 3 years ago
- A PyTorch automatic speech recognition framework for streaming and non-streaming use, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and multiple data augmentation methods. ☆724 · Dec 17, 2025 · Updated 3 months ago
- Chinese speech recognition, automatic speech recognition (ASR) ☆14 · Dec 30, 2021 · Updated 4 years ago
- Deep-learning-based Mandarin speech recognition ☆18 · Apr 23, 2019 · Updated 6 years ago
- Collection of Common Machine Translation Tools ☆11 · Jul 26, 2022 · Updated 3 years ago
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021) ☆10 · Oct 11, 2021 · Updated 4 years ago
- ☆18 · May 28, 2024 · Updated last year
- PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS… ☆113 · Feb 27, 2022 · Updated 4 years ago
- Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning", ICMR 2022 ☆15 · Oct 25, 2024 · Updated last year
- ASR tutorial: https://dataxujing.github.io/ASR-paper/ ☆25 · Jul 1, 2024 · Updated last year
- ☆16 · Apr 27, 2025 · Updated 11 months ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆47 · Oct 26, 2024 · Updated last year
- An open-source community implementation of the model MELLE from the paper: "Autoregressive Speech Synthesis without Vector Quantization" ☆14 · Updated this week
- Seq2Seq Speech Recognition with Transformer on Mandarin Chinese ☆118 · Dec 20, 2019 · Updated 6 years ago
- A simple dictionary in Manchu, Chinese and English. ☆13 · Feb 27, 2015 · Updated 11 years ago
- ☆15 · Jan 7, 2023 · Updated 3 years ago
- My implementation of InstantBooth ☆13 · Sep 11, 2023 · Updated 2 years ago
- An overview of Flutter widgets ☆10 · Oct 7, 2023 · Updated 2 years ago
- A Cocos Creator 3.x plugin for conveniently managing multiple UI states ☆17 · Jun 25, 2023 · Updated 2 years ago
- The code for AAAI 2025 "Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation" ☆15 · Jan 3, 2025 · Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 · Aug 5, 2024 · Updated last year
- This repository contains materials for the paper: Towards generating ambisonics using audio-visual cue for virtual reality ☆13 · Jul 2, 2019 · Updated 6 years ago
- CHisIEC: An Information Extraction Corpus for Ancient Chinese History ☆20 · Nov 25, 2025 · Updated 4 months ago
- Open source code for AAAI 2024 Paper "From Retrieval to Generation: A Simple and Unified Generative Model for End-to-End Task-Oriented Di… ☆22 · Apr 8, 2024 · Updated last year
- ☆13 · Jan 13, 2022 · Updated 4 years ago