robflynnyh / long-context-asr
Code for the paper: How Much Context Does My Attention-Based ASR System Need?
☆11 Updated last month
Alternatives and similar repositories for long-context-asr:
Users who are interested in long-context-asr are comparing it to the libraries listed below.
- ☆16 Updated 2 years ago
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆17 Updated 6 months ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- A collection of papers related to speech model compression ☆24 Updated last year
- Multipurpose Multi Speaker Mixture Signal Generator ☆44 Updated 2 months ago
- ☆12 Updated 3 years ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆51 Updated last month
- Python wrappers for Kaldi Levenshtein's distance and alignment code. ☆63 Updated last year
- ☆12 Updated 2 months ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z… ☆30 Updated 3 years ago
- ☆15 Updated 3 years ago
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition" ☆17 Updated 3 years ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆39 Updated last year
- A handy dataset of noises for ASR ☆21 Updated 5 years ago
- Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi. ☆26 Updated 8 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training ☆39 Updated 4 years ago
- CMU multilingual speech repository ☆31 Updated 2 years ago
- Official implementation of DGP-based multi-speaker speech synthesis with PyTorch ☆24 Updated 4 years ago
- Pronunciation-assisted Subword Modeling ☆29 Updated 5 years ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations" ☆13 Updated 2 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆51 Updated last year
- ☆37 Updated 6 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Updated 9 months ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation ☆39 Updated 4 years ago
- Balanced Error Rate for Speaker Diarization ☆30 Updated 2 years ago
- Temporary anonymous version ☆22 Updated last year
- ☆11 Updated 2 years ago
- Whisper Speech Quality Assessment (WhiSQA) ☆9 Updated 4 months ago
- ☆16 Updated 3 years ago