Implementation of the contextual biasing for ASR decoding on GPUs without lattice generation. The code supports submission to Interspeech 2023.
☆21Sep 25, 2023Updated 2 years ago
Alternatives and similar repositories for contextual-biasing-on-gpus
Users that are interested in contextual-biasing-on-gpus are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- A pytorch wrapper for LF-MMI training and parallel training in Kaldi☆73Jun 8, 2022Updated 3 years ago
- ☆15Jul 4, 2024Updated last year
- Repository for Accent Recognition (Hackathon @SLT2022)☆38May 12, 2024Updated last year
- Conversion of recurrent neural network language models to weighted finite state transducers☆58Jun 1, 2018Updated 7 years ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…☆20Oct 11, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Lattice combination algorithm to combine inaccurate transcripts with hypothesis lattices☆16Mar 19, 2024Updated 2 years ago
- 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition☆118Jun 22, 2022Updated 3 years ago
- Losses and decoders for end-to-end ASR and OCR☆34Oct 30, 2020Updated 5 years ago
- Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆150Jan 16, 2024Updated 2 years ago
- End-to-End Speech Processing Toolkit☆15Jan 20, 2025Updated last year
- Tiny configuration for Triton Inference Server☆45Jan 10, 2025Updated last year
- 语音识别 论文 前沿☆51Jan 8, 2022Updated 4 years ago
- Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.☆13Feb 20, 2024Updated 2 years ago
- Scripts for training Kaldi for German speech recognition (ASR).☆27Feb 11, 2021Updated 5 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"☆17Nov 28, 2021Updated 4 years ago
- ASR教程: https://dataxujing.github.io/ASR-paper/☆25Jul 1, 2024Updated last year
- This is a repository for a paper accepted at the 2022 IEEE Spoken Language Technology Workshop (SLT 2022)☆16Dec 1, 2022Updated 3 years ago
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23☆23Feb 26, 2023Updated 3 years ago
- ☆30Jan 22, 2026Updated 2 months ago
- Decoders from Kaldi using OpenFst☆34Jan 29, 2026Updated last month
- ☆12Jul 6, 2023Updated 2 years ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""☆15Jun 28, 2024Updated last year
- ☆60Updated this week
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Small language toolkit for creation, interpolation and pruning of ARPA language models☆92Aug 6, 2022Updated 3 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 7 months ago
- ☆89Mar 5, 2021Updated 5 years ago
- 🔥 语音合成(TTS),语音克隆教程: https://dataxujing.github.io/TTS-paper/#/☆11Oct 29, 2024Updated last year
- Scientific Reports - Open access - Published: 14 February 2025☆51Oct 30, 2024Updated last year
- c++ Kaldi IO lib (static and dynamic).☆25Nov 26, 2018Updated 7 years ago
- Illustrating EM for GMMs and HMMs☆12May 9, 2020Updated 5 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆24Oct 19, 2023Updated 2 years ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW…☆18Nov 30, 2022Updated 3 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- A toolkit for benchmarking on a wide variety of audio deepfake datasets.☆29Oct 9, 2025Updated 5 months ago
- ☆15Nov 10, 2025Updated 4 months ago
- ☆32Oct 28, 2022Updated 3 years ago
- [INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation☆41Sep 1, 2023Updated 2 years ago
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer☆38Feb 17, 2025Updated last year
- A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech…☆12Jul 24, 2024Updated last year
- ☆40Apr 16, 2024Updated last year