This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language Modeling" (ICASSP 2023)
☆20 · Jan 3, 2023 · Updated 3 years ago
Alternatives and similar repositories for SLM-Discrete-Representations
Users who are interested in SLM-Discrete-Representations are comparing it to the libraries listed below
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5 · ☆19 · Nov 29, 2022 · Updated 3 years ago
- ☆31 · Jul 13, 2023 · Updated 2 years ago
- A spoken version of the textual story cloze benchmark · ☆20 · Aug 6, 2023 · Updated 2 years ago
- The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?" · ☆20 · Jul 9, 2025 · Updated 7 months ago
- ASR text preprocessing utility · ☆21 · Aug 5, 2024 · Updated last year
- Official implementation of MelHuBERT · ☆68 · Feb 21, 2026 · Updated last week
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… · ☆23 · Mar 14, 2024 · Updated last year
- Super Flappy Bird in p5.js · ☆10 · Mar 8, 2021 · Updated 4 years ago
- The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral) · ☆49 · Aug 15, 2025 · Updated 6 months ago
- A pitch detection model trained to be robust against noise and reverberation environments. · ☆27 · Jan 21, 2025 · Updated last year
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730 · ☆131 · Dec 8, 2023 · Updated 2 years ago
- ☆13 · Sep 25, 2024 · Updated last year
- This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (… · ☆28 · Aug 8, 2022 · Updated 3 years ago
- A repo to do interpretability of pre-trained acoustic models · ☆15 · Oct 15, 2023 · Updated 2 years ago
- Automatic Measurement of Vowel Duration for Consonant Vowel Consonant (CVC) sound files (JASA 2016) · ☆14 · Feb 25, 2017 · Updated 9 years ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023 · ☆27 · Apr 27, 2023 · Updated 2 years ago
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat… · ☆114 · Sep 3, 2025 · Updated 6 months ago
- Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models · ☆15 · Sep 10, 2025 · Updated 5 months ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus · ☆13 · Oct 15, 2022 · Updated 3 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 · ☆16 · Nov 19, 2024 · Updated last year
- Which ML are you? · ☆13 · Jan 3, 2023 · Updated 3 years ago
- ☆12 · Jul 23, 2024 · Updated last year
- Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology… · ☆55 · Nov 4, 2022 · Updated 3 years ago
- Unsupervised spoken sentence embeddings · ☆14 · Dec 14, 2022 · Updated 3 years ago
- Fast and differentiable hidden Markov model in C++ · ☆19 · Jan 20, 2023 · Updated 3 years ago
- Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling · ☆15 · Oct 9, 2023 · Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 · ☆12 · May 13, 2024 · Updated last year
- ☆41 · May 15, 2023 · Updated 2 years ago
- Comprehensive quantitative comparison of lossless and lossy audio codecs · ☆39 · Feb 11, 2023 · Updated 3 years ago
- ☆32 · Nov 24, 2024 · Updated last year
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" · ☆37 · Aug 29, 2023 · Updated 2 years ago
- ☆15 · Sep 9, 2021 · Updated 4 years ago
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener… · ☆17 · Jul 1, 2024 · Updated last year
- ESLTTS dataset · ☆16 · Feb 6, 2025 · Updated last year
- Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included. · ☆140 · Sep 25, 2024 · Updated last year
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab… · ☆36 · Feb 10, 2024 · Updated 2 years ago
- Official release of StyleTalk dataset. · ☆72 · Jul 1, 2024 · Updated last year
- Yet another minimalist deep-learning framework optimized for inference · ☆36 · Updated this week
- CSCW 2023 Best Demo Award: Conversational AI Explanations to Support Human-AI Scientific Writing · ☆14 · Jun 25, 2023 · Updated 2 years ago