Speech-Lab-IITM / CCC-wav2vec-2.0View external linksLinks
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech representations
☆23Mar 18, 2024Updated last year
Alternatives and similar repositories for CCC-wav2vec-2.0
Users that are interested in CCC-wav2vec-2.0 are comparing it to the libraries listed below
Sorting:
- ☆13Sep 25, 2024Updated last year
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- A toolkit for Spoken Language Understanding Evaluation (SLUE) benchmark. Refer paper https://arxiv.org/abs/2111.10367 for more details. O…☆66Feb 26, 2024Updated last year
- ☆31Jul 13, 2023Updated 2 years ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 5, 2025Updated last year
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Jun 2, 2023Updated 2 years ago
- Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together☆46Jul 3, 2025Updated 7 months ago
- Pronunciation-assisted Subword Modeling☆31May 30, 2019Updated 6 years ago
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias☆20Jul 21, 2020Updated 5 years ago
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"☆23Mar 6, 2023Updated 2 years ago
- Taiwanese Speech Synthesis with Tacotron2☆25Oct 2, 2022Updated 3 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- A JAX library for building lattice-based speech transducer models☆46Jan 8, 2026Updated last month
- Phonetically-Oriented Word Error Rate☆36May 4, 2019Updated 6 years ago
- [ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection☆25May 18, 2023Updated 2 years ago
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"☆17Nov 28, 2021Updated 4 years ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …☆24Jan 9, 2024Updated 2 years ago
- Super Flappy Bird in p5.js☆10Mar 8, 2021Updated 4 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆40Aug 29, 2024Updated last year
- PodcastMix A dataset for separating music and speech in podcasts.☆44Aug 20, 2024Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- The Android application providing user with REST-based interface for utilizing built-in Android's TTS engine. The web service is highly c…☆11Jul 28, 2020Updated 5 years ago
- ☆10Apr 17, 2024Updated last year
- ☆11Nov 5, 2021Updated 4 years ago
- Docker for building an environment for Dutch online and offline ASR.☆12Feb 2, 2021Updated 5 years ago
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated last year
- Using YouTube to prepare a speech recognition dataset for any language☆10Mar 30, 2021Updated 4 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.☆11Nov 3, 2020Updated 5 years ago
- Convert words to numbers☆21Apr 13, 2022Updated 3 years ago
- Prosodic Speech Segmentation with Transformers☆26Feb 25, 2024Updated last year
- Transfer learning approach to pronunciation scoring☆11Jan 17, 2024Updated 2 years ago