[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation
☆41Sep 1, 2023Updated 2 years ago
Alternatives and similar repositories for CIF-HieraDist
Users that are interested in CIF-HieraDist are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-…☆79Jan 9, 2025Updated last year
- [ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection☆25May 18, 2023Updated 2 years ago
- ☆16Nov 9, 2023Updated 2 years ago
- Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding☆11May 19, 2023Updated 2 years ago
- Conformer RNN-Transducer☆14May 25, 2022Updated 3 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆14Nov 26, 2024Updated last year
- ICASSP 2023: "Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition"☆14Nov 29, 2024Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- A curated list of awesome papers on contextualizing E2E ASR outputs☆80May 10, 2023Updated 2 years ago
- NAR-BERT-ASR☆10Sep 27, 2021Updated 4 years ago
- ☆24Mar 11, 2026Updated 2 weeks ago
- This setup allows to train end-to-end neural models for spoken language understanding (SLU).☆11Jun 12, 2023Updated 2 years ago
- ☆15Aug 25, 2022Updated 3 years ago
- Implementation of the contextual biasing for ASR decoding on GPUs without lattice generation. The code supports submission to Interspeech…☆21Sep 25, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- E2E system with LF-MMI; word N-gram for Mandarin☆167Apr 29, 2022Updated 3 years ago
- End-to-End Speech Processing Toolkit☆15Jan 20, 2025Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆30Aug 2, 2025Updated 7 months ago
- An Open-Source Tool for Automatic Disease Diagnosis..☆20Feb 25, 2024Updated 2 years ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- PyTorch implementation of WASE described in our ICASSP 2021: "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Envi…☆27Jan 11, 2022Updated 4 years ago
- PyTorch re-implementation of Speech-Transformer☆102Nov 19, 2021Updated 4 years ago
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆22Apr 27, 2024Updated last year
- A CRF-based ASR Toolkit☆366Feb 5, 2026Updated last month
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- open-source Mandarian biased word dataset☆14Sep 21, 2023Updated 2 years ago
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…☆50May 14, 2025Updated 10 months ago
- CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning [Official PyTorch implementation]☆22Jun 12, 2025Updated 9 months ago
- (WIP)long form speech generatoins☆31Apr 2, 2025Updated 11 months ago
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level…☆27Dec 4, 2024Updated last year
- End-to-end ASR repository for AGI☆20Dec 19, 2025Updated 3 months ago
- ASR, End-to-End, end2end, Speech Recognition, 端到端语音识别☆12Oct 25, 2020Updated 5 years ago
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23☆23Feb 26, 2023Updated 3 years ago
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"☆117Jan 26, 2024Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5☆49Mar 19, 2025Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated 11 months ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)☆146Aug 5, 2022Updated 3 years ago
- wenet_LLM_from_ASLP☆15Nov 26, 2024Updated last year
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…☆36Feb 10, 2024Updated 2 years ago