skit-ai/slu-prosody

Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 2023.

☆27

Alternatives and similar repositories for slu-prosody

Users who are interested in slu-prosody are comparing it to the libraries listed below.

jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
Observeai-Research / Phoneme-BERT
☆34 · Jun 15, 2021 · Updated 5 years ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
skit-ai / emotion-tts-dataset
Dataset release for Emotional TTS in Indian Accent
☆41 · Mar 25, 2026 · Updated 4 months ago
Splend1d / T5lephone
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19 · Nov 29, 2022 · Updated 3 years ago
liyunlongaaa / AD-TUNING
AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…
☆11 · Feb 23, 2024 · Updated 2 years ago
skit-ai / phone-number-entity-dataset
Dataset Release for Phone Number Entity capture task
☆14 · Sep 2, 2022 · Updated 3 years ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
vectominist / spin
Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…
☆65 · May 19, 2023 · Updated 3 years ago
nervjack2 / Speech2Unit
☆13 · Sep 25, 2024 · Updated last year
skit-ai / speech-to-intent-dataset
Dataset Release for Intent Classification from Speech
☆48 · Feb 23, 2025 · Updated last year
skit-ai / Map-Mix
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix(work accepted at I…
☆18 · Feb 17, 2023 · Updated 3 years ago
Open-Speech-EkStep / indic-punct
☆45 · Dec 15, 2022 · Updated 3 years ago
sinhat98 / adapter-wavlm
☆46 · Feb 16, 2023 · Updated 3 years ago
Speech-Lab-IITM / CCC-wav2vec-2.0
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆23 · Mar 18, 2024 · Updated 2 years ago
sungnyun / ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41 · Aug 29, 2024 · Updated last year
Fuann / hmamba
Towards Efficient and Multifaceted Computer-assisted Pronunciation Training Leveraging Hierarchical Selective State Space Model and Decou…
☆16 · May 6, 2025 · Updated last year
mct10 / CoBERT
Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
☆48 · Nov 8, 2023 · Updated 2 years ago
skit-ai / N-Best-ASR-Transformer
Code for ACL-IJCNLP 2021 paper "N-Best-ASR-Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses."
☆17 · Nov 30, 2021 · Updated 4 years ago
cageyoko / CTC-Attention-Mispronunciation
A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augment Techniques
☆64 · Apr 29, 2021 · Updated 5 years ago
rhasspy / tts-prompts
Phonetically balanced text to speech sentences
☆10 · Aug 16, 2021 · Updated 4 years ago
Open-Speech-EkStep / data-acquisition-pipeline
☆18 · Apr 28, 2021 · Updated 5 years ago
ga642381 / SpeechPrompt
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102 · Apr 10, 2025 · Updated last year
prairie-schooner / wav2vec-vc
☆10 · Mar 22, 2023 · Updated 3 years ago
juice500ml / dysarthria-gop
Official implementation of the paper "Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Unce…
☆28 · Mar 13, 2025 · Updated last year
Speech-Lab-IITM / data2vec-aqc
Repository having the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini…
☆13 · Mar 18, 2024 · Updated 2 years ago
slp-rl / WhiStress
The official repo of "WhiStress: Enriching Transcriptions with Sentence Stress Detection" (Interspeech 2025)
☆39 · Jul 24, 2025 · Updated last year
Hertin / WavPrompt
☆37 · Jun 30, 2022 · Updated 4 years ago
JSALT-2022-SSL / superb-prosody
☆31 · Jul 13, 2023 · Updated 3 years ago
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
ErikEkstedt / conv_ssl
☆14 · Feb 9, 2023 · Updated 3 years ago
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024
☆16 · Nov 19, 2024 · Updated last year
ag1988 / mel-asr
The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…
☆21 · Oct 11, 2024 · Updated last year
skit-ai / tech
Skit's tech website
☆11 · Jul 1, 2024 · Updated 2 years ago
amazon-science / contextual-attention-nlm
Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.
☆14 · Jul 25, 2023 · Updated 3 years ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 4 months ago
AI4Bharat / DocSim
Synthetically generate random text document images with ground-truth
☆14 · Jul 20, 2021 · Updated 5 years ago
robd003 / sph2pipe
provide SPHERE-formatted output as well as RIFF, AU, AIFF and raw
☆14 · Dec 18, 2021 · Updated 4 years ago
huangruizhe / ConEC
☆14 · Jun 17, 2024 · Updated 2 years ago