[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation
☆41Sep 1, 2023Updated 2 years ago
Alternatives and similar repositories for CIF-HieraDist
Users that are interested in CIF-HieraDist are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-…☆78Jan 9, 2025Updated last year
- [ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection☆25May 18, 2023Updated 3 years ago
- ☆16Nov 9, 2023Updated 2 years ago
- Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding☆11May 19, 2023Updated 3 years ago
- Conformer RNN-Transducer☆14May 25, 2022Updated 4 years ago
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- ☆15Nov 26, 2024Updated last year
- ICASSP 2023: "Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition"☆14Nov 29, 2024Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- A curated list of awesome papers on contextualizing E2E ASR outputs☆80May 10, 2023Updated 3 years ago
- NAR-BERT-ASR☆10Sep 27, 2021Updated 4 years ago
- This setup allows to train end-to-end neural models for spoken language understanding (SLU).☆11Jun 12, 2023Updated 3 years ago
- ☆15Aug 25, 2022Updated 3 years ago
- E2E system with LF-MMI; word N-gram for Mandarin☆167Apr 29, 2022Updated 4 years ago
- Implementation of the contextual biasing for ASR decoding on GPUs without lattice generation. The code supports submission to Interspeech…☆21Sep 25, 2023Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- End-to-End Speech Processing Toolkit☆16Jan 20, 2025Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆32Aug 2, 2025Updated 10 months ago
- An Open-Source Tool for Automatic Disease Diagnosis..☆20Feb 25, 2024Updated 2 years ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- PyTorch implementation of WASE described in our ICASSP 2021: "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Envi…☆27Jan 11, 2022Updated 4 years ago
- PyTorch re-implementation of Speech-Transformer☆102Nov 19, 2021Updated 4 years ago
- CAT is more than a CRF-based ASR toolkit: it provides a complete workflow for data-efficient end-to-end ASR, supporting CTC, CTC-CRF, RNN…☆368Feb 5, 2026Updated 4 months ago
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆23Apr 27, 2024Updated 2 years ago
- open-source Mandarian biased word dataset☆14Sep 21, 2023Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…☆51May 14, 2025Updated last year
- CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning [Official PyTorch implementation]☆27Jun 12, 2025Updated last year
- (WIP)long form speech generatoins☆31Apr 2, 2025Updated last year
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level…☆28Dec 4, 2024Updated last year
- End-to-end ASR repository for AGI☆20Dec 19, 2025Updated 5 months ago
- ASR, End-to-End, end2end, Speech Recognition, 端到端语音识别☆12Oct 25, 2020Updated 5 years ago
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"☆118Jan 26, 2024Updated 2 years ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆60Apr 4, 2024Updated 2 years ago
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23☆24Feb 26, 2023Updated 3 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 3 years ago
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated last year
- wenet_LLM_from_ASLP☆15Nov 26, 2024Updated last year
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)☆146Aug 5, 2022Updated 3 years ago
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…☆37Feb 10, 2024Updated 2 years ago
- 一键将视频转换为优质小红书笔记,自动优化内容和配图;追加了可以读取本地视频的功能☆12Dec 22, 2024Updated last year
- [ACL 2025 Main] A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5☆55Mar 19, 2025Updated last year