This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language Modeling" (ICASSP 2023)
☆20Jan 3, 2023Updated 3 years ago
Alternatives and similar repositories for SLM-Discrete-Representations
Users that are interested in SLM-Discrete-Representations are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?"☆20Jul 9, 2025Updated 8 months ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- A spoken version of the textual story cloze benchmark☆20Aug 6, 2023Updated 2 years ago
- ☆31Jul 13, 2023Updated 2 years ago
- ☆19Jan 8, 2025Updated last year
- The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral)☆49Aug 15, 2025Updated 7 months ago
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener…☆17Jul 1, 2024Updated last year
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730☆131Dec 8, 2023Updated 2 years ago
- This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (…☆28Aug 8, 2022Updated 3 years ago
- Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…☆55Nov 4, 2022Updated 3 years ago
- Unsupervised spoken sentence embeddings☆14Dec 14, 2022Updated 3 years ago
- Super Flappy Bird in p5.js☆10Mar 8, 2021Updated 5 years ago
- ☆10Apr 17, 2024Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image …☆88Jun 18, 2024Updated last year
- Official implementation of MelHuBERT☆69Feb 21, 2026Updated last month
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- A publishing website of a table collecting meta-learning-related papers in the area of human language processing.☆17Aug 2, 2021Updated 4 years ago
- ☆13Sep 25, 2024Updated last year
- ☆15Sep 9, 2021Updated 4 years ago
- ☆41May 15, 2023Updated 2 years ago
- A pitch detection model trained to be robust against noise and reverberation environments.☆27Jan 21, 2025Updated last year
- Library for Textless Spoken Language Processing☆555Aug 29, 2023Updated 2 years ago
- Official implementation of "Describing Sets of Images with Textual-PCA".☆16Feb 13, 2023Updated 3 years ago
- CSCW 2023 Best Demo Award: Conversational AI Explanations to Support Human-AI Scientific Writing☆14Jun 25, 2023Updated 2 years ago
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr…☆23Mar 14, 2024Updated 2 years ago
- Taiwanese Speech Synthesis with Tacotron2☆25Oct 2, 2022Updated 3 years ago
- Automatic Measurement of Vowel Duration for Consonant Vowel Consonant (CVC) sound files (JASA 2016)☆14Feb 25, 2017Updated 9 years ago
- Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling☆15Oct 9, 2023Updated 2 years ago
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…☆416Aug 29, 2023Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆13Oct 11, 2022Updated 3 years ago
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…☆36Feb 10, 2024Updated 2 years ago
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat…☆115Sep 3, 2025Updated 6 months ago
- ☆19Apr 28, 2023Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023☆12May 13, 2024Updated last year
- An official implementation of ProbeGen☆13Oct 20, 2024Updated last year
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023☆27Apr 27, 2023Updated 2 years ago
- TEAL: New Selection Strategy for Small Buffers in Experience Replay Class Incremental Learning☆17Jan 21, 2025Updated last year
- ☆12Jul 23, 2024Updated last year