A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on Transformer
☆85Apr 29, 2024Updated 2 years ago
Alternatives and similar repositories for Automatic-Speech-Recognition-from-Scratch
Users that are interested in Automatic-Speech-Recognition-from-Scratch are comparing it to the libraries listed below.
- Pytorch implementation for DeepSpeech 2.0☆31Jul 25, 2024Updated last year
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆23Apr 27, 2024Updated 2 years ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆20Nov 24, 2025Updated 6 months ago
- This project is derived from zipEnhancer, the speech denoising model open-sourced by Alibaba☆38May 8, 2026Updated 3 weeks ago
- Pytorch implementation of conformer with training script for end-to-end speech recognition on the LibriSpeech dataset.☆29May 1, 2024Updated 2 years ago
- Deep-learning-based dialogue systems, speech recognition, machine translation, speech synthesis, and more.☆13Jan 3, 2021Updated 5 years ago
- Awesome Automatic Speech Recognition (ASR) paper collection☆22Sep 4, 2020Updated 5 years ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆27Feb 11, 2026Updated 3 months ago
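- A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition; it currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and multiple data augmentation methods.☆723Dec 17, 2025Updated 5 months ago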
- A voice spoofing detection system, based on paper presented at ICSPIS 2021☆10Feb 11, 2022Updated 4 years ago
- Implements a simple Weibo keyword crawler plus sentiment analysis based on the GPT 3.5 model☆16Sep 7, 2023Updated 2 years ago
- Implementation of RNN transducer☆23Jun 25, 2019Updated 6 years ago
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021)☆10Oct 11, 2021Updated 4 years ago
- ☆18May 28, 2024Updated 2 years ago
- PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS…☆114Feb 27, 2022Updated 4 years ago
- Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning,ICMR,2022"☆15Oct 25, 2024Updated last year
- ☆16Apr 27, 2025Updated last year
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages☆17Apr 14, 2026Updated last month
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆53Sep 20, 2025Updated 8 months ago
- Openreviewers: Multi Agent Academic Review Simulation System☆23Mar 2, 2024Updated 2 years ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units☆47Oct 26, 2024Updated last year
- An open source community implementation of the model MELLE from the paper: "Autoregressive Speech Synthesis without Vector Quantization"☆15Updated this week
- Seq2Seq Speech Recognition with Transformer on Mandarin Chinese☆117Dec 20, 2019Updated 6 years ago
- ☆15Jan 7, 2023Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆347May 15, 2024Updated 2 years ago
- Official Implementation of our Interspeech 2021 paper "An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure …☆18Feb 15, 2022Updated 4 years ago
- ☆53Oct 19, 2025Updated 7 months ago
- An overview of flutter widgets☆10Oct 7, 2023Updated 2 years ago
- 🚀 Hainan University Compiler Principles course: an extension of the pl0 language compiler☆10Dec 19, 2020Updated 5 years ago
- [Tensorflow] A Game Theoretic approach using GAN for Phishing URL synthesis and detection☆11Nov 14, 2022Updated 3 years ago
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- This repository contains materials for the paper: Towards generating ambisonics using audio-visual cue for virtual reality☆13Jul 2, 2019Updated 6 years ago
- CHisIEC: An Information Extraction Corpus for Ancient Chinese History☆21Nov 25, 2025Updated 6 months ago
- Open source code for AAAI 2024 Paper "From Retrieval to Generation: A Simple and Unified Generative Model for End-to-End Task-Oriented Di…☆23Apr 8, 2024Updated 2 years ago
- ☆14Mar 31, 2023Updated 3 years ago
- ☆50Oct 10, 2023Updated 2 years ago
- Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project☆16Oct 28, 2022Updated 3 years ago
- A PyTorch implementation of MIT CSAIL's Speech2Face research paper from IEEE CVPR 2019☆13Mar 25, 2023Updated 3 years ago
- ☆11Jun 15, 2019Updated 6 years ago