Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
☆48Nov 8, 2023Updated 2 years ago
Alternatives and similar repositories for CoBERT
Users who are interested in CoBERT are comparing it to the libraries listed below
- ☆13Sep 25, 2024Updated last year
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆40Aug 29, 2024Updated last year
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆56Jun 25, 2024Updated last year
- Code for the method proposed in the paper "ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…☆23Mar 18, 2024Updated last year
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- [ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection☆25May 18, 2023Updated 2 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Jun 2, 2023Updated 2 years ago
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…☆45Mar 25, 2024Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless, and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆13Oct 11, 2022Updated 3 years ago
- PyTorch model for contextless phoneme prediction from speech audio☆32Oct 30, 2025Updated 4 months ago
- ☆10Oct 20, 2022Updated 3 years ago
- Russian phonetic transcription☆11Nov 19, 2025Updated 3 months ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- ☆25Feb 12, 2023Updated 3 years ago
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated 2 years ago
- Explore different ways to combine speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT)☆46Jul 3, 2025Updated 8 months ago
- Sequence alignment methods with helpers for PyTorch.☆24Nov 30, 2022Updated 3 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆69Aug 13, 2024Updated last year
- Neural model for prediction of stress position in Russian words☆13Jun 22, 2025Updated 8 months ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆11Mar 14, 2025Updated 11 months ago
- ☆11Mar 22, 2023Updated 2 years ago
- ☆31Jul 13, 2023Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- An STFT/iSTFT written in PyTorch using 1D convolutions☆32Jul 9, 2024Updated last year
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 11 months ago
- StyleTTS2 + Vocos as a Decoder☆13Mar 24, 2025Updated 11 months ago
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio☆73Mar 17, 2025Updated 11 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆64May 19, 2023Updated 2 years ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆192Jul 12, 2024Updated last year
- Streaming Vocos☆30Jun 10, 2025Updated 8 months ago
- Prosodic Speech Segmentation with Transformers☆26Feb 25, 2024Updated 2 years ago
- ☆14Aug 19, 2024Updated last year
- Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.☆14Jul 25, 2023Updated 2 years ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA☆18Aug 16, 2024Updated last year