Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡
☆11Jan 23, 2025Updated last year
Alternatives and similar repositories for Automatic-Speech-Recognition-with-PyTorch
Users that are interested in Automatic-Speech-Recognition-with-PyTorch are comparing it to the libraries listed below.
- A toolkit for researchers in multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- ☆22Jul 16, 2025Updated 8 months ago
- [ICON 2020] TensorFlow Code for "End-to-End Automatic Speech Recognition System for Gujarati"☆13Jul 26, 2021Updated 4 years ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated 2 years ago
- Blood Cell Detection and Classification using CNN and LSTMs☆11Jan 12, 2021Updated 5 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- The RAG pipeline for optimizing dynamic data editing.☆20Oct 30, 2025Updated 4 months ago
- A record of papers, code, blog posts, and related materials on AEC☆15Jul 26, 2022Updated 3 years ago
- ☆25Aug 29, 2025Updated 7 months ago
- Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.☆18Nov 13, 2021Updated 4 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆76Jul 29, 2024Updated last year
- AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…☆11Nov 21, 2023Updated 2 years ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition☆14Nov 15, 2025Updated 4 months ago
- Detecting Broken Glass Insulators for Automated UAV Power Line Inspection Based on an Improved YOLOv8 Model☆22Sep 22, 2024Updated last year
- A Python specialization course on Coursera offered by the **University of Michigan**. This repository contains the solutions of the …☆20Jun 9, 2020Updated 5 years ago
- ☆21Aug 25, 2025Updated 7 months ago
- Efficient voice activity detection algorithm using long-term spectral flatness measurement☆15Feb 21, 2017Updated 9 years ago
- The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…☆11Aug 27, 2023Updated 2 years ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha…☆38Aug 7, 2024Updated last year
- The code about “LABNet: A Lightweight Attentive Beamforming Network for Ad-hoc Multichannel Microphone Invariant Real-Time Speech Enhance…☆40Oct 10, 2025Updated 5 months ago
- From Python basics to Machine Learning and PyTorch Deep Learning - one day at a time, explore it all☆10May 25, 2025Updated 10 months ago
- wsj0-{2, 3, 4, 5} mix generation scripts, in Python.☆78Mar 17, 2021Updated 5 years ago
- ☆32Oct 23, 2025Updated 5 months ago
- Both audio-only and audio-visual speaker diarization datasets are listed here.☆15Feb 22, 2023Updated 3 years ago
- Official code for PEEKABOO2: Adapting Peekaboo with Segment Anything Model for Unsupervised Object Localization in Images and Videos.☆30Dec 27, 2025Updated 3 months ago
- ASLP Summer Inter@NPU☆12Jul 30, 2024Updated last year
- Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation☆24Nov 4, 2025Updated 4 months ago
- ☆169Mar 11, 2026Updated 2 weeks ago
- ☆11Mar 18, 2024Updated 2 years ago
- Code repository for paper "DAS-N2N: Machine learning Distributed Acoustic Sensing (DAS) signal denoising without clean data" (https://arx…☆40Jan 23, 2025Updated last year
- ☆23Jul 17, 2024Updated last year
- Keypoint Detection Using Detectron 2 on Custom Dataset☆24Sep 10, 2021Updated 4 years ago
- ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation☆49Apr 14, 2025Updated 11 months ago
- 🤫A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features☆23Dec 10, 2025Updated 3 months ago
- ☆23Jul 1, 2025Updated 8 months ago
- Variations of L1 SNR Loss function for training audio source separation machine learning models☆43Feb 24, 2026Updated last month
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic…☆56Aug 15, 2025Updated 7 months ago
- Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconst…☆32Dec 25, 2025Updated 3 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆45Sep 5, 2025Updated 6 months ago