robflynnyh / long-context-asr
Code for the paper: How Much Context Does My Attention-Based ASR System Need?
☆11Jan 28, 2026Updated 2 weeks ago
Alternatives and similar repositories for long-context-asr
Users that are interested in long-context-asr are comparing it to the libraries listed below
- ☆16Jun 13, 2022Updated 3 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training☆41Dec 18, 2020Updated 5 years ago
- Perform the forced decoding with target transcription☆11Sep 12, 2018Updated 7 years ago
- ☆24Jul 15, 2024Updated last year
- ☆45Jul 15, 2022Updated 3 years ago
- A repo that builds text-to-music datasets from scratch, used in MuseControlLite [ICML2025]☆27May 20, 2025Updated 8 months ago
- ☆48Jan 8, 2021Updated 5 years ago
- More than Just Words: Modeling Non-textual Characteristics of Podcasts☆26Nov 6, 2019Updated 6 years ago
- Voice100 includes neural TTS/ASR models. Inference of Voice100 is low cost as its models are tiny and only depend on CNN without autoregr …☆28Nov 23, 2023Updated 2 years ago
- Speedup the attention computation of Swin Transformer☆31Jun 14, 2025Updated 8 months ago
- Visually-informed Music Source Separation project at Jeju 2018 Deep Learning Summer Camp☆30Sep 14, 2018Updated 7 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)☆34Aug 6, 2023Updated 2 years ago
- Articulatory features estimation using Listen Attend and Spell architecture.☆33Apr 24, 2020Updated 5 years ago
- [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-…☆80Jan 9, 2025Updated last year
- ☆12Nov 23, 2021Updated 4 years ago
- Punctuation restoration in ASR text☆33Jul 1, 2019Updated 6 years ago
- Korean Streaming ASR (with Denoiser and Conformer CTC)☆39Apr 28, 2024Updated last year
- Official code of ElasticAST (Interspeech 2024 paper)☆34Jul 30, 2024Updated last year
- Generalised UDRL☆37May 12, 2022Updated 3 years ago
- ☆37Jun 28, 2021Updated 4 years ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13May 12, 2023Updated 2 years ago
- ☆10Jul 24, 2019Updated 6 years ago
- ☆36Aug 25, 2022Updated 3 years ago
- ☆14Mar 15, 2022Updated 3 years ago
- Python API client for the Central Bank of the Republic of Türkiye (TCMB - CBRT) web service.☆14May 3, 2024Updated last year
- In a TÜBİTAK-approved project, sensor measurement results are sent to a Flutter mobile app via the ESP32's Bluetooth module.☆13Jun 19, 2022Updated 3 years ago
- Nr. 1 ranked "Pitch Detector" on the web. Implemented with WebAssembly.☆11Mar 24, 2021Updated 4 years ago
- Fast constant-Q transform feature, C++ implementation☆11Jul 6, 2023Updated 2 years ago
- An automatic sample identification (ASID) system using a contrastively trained GNN encoder.☆13Sep 21, 2025Updated 4 months ago
- A package for crawling audio from YouTube☆42Aug 8, 2023Updated 2 years ago
- Worked example of the process from Python source to CUDA kernel execution with Numba☆45Sep 11, 2024Updated last year
- ☆46Nov 2, 2023Updated 2 years ago
- transcribe audio feeds into public web ui☆45Aug 31, 2022Updated 3 years ago
- Train a Mixture of Factor Analyzers (MFA) / Mixture of Probabilistic PCA (MPPCA) - low-rank-plus-diagonal GMMs using pytorch☆41Oct 30, 2022Updated 3 years ago
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations☆38Jun 12, 2023Updated 2 years ago
- ☆10Dec 25, 2021Updated 4 years ago
- E2E system with LF-MMI; word N-gram for Mandarin☆166Apr 29, 2022Updated 3 years ago
- Little toolkit written in C to extract GPS data from Dash Cam 70mai Pro MP4 files to SRT (subtitles)☆11Jun 10, 2020Updated 5 years ago
- A Visualizer for prosodically annotated speech corpora☆12Oct 27, 2021Updated 4 years ago