Splend1d / T5lephone
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19Updated last year
Related projects:
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆24Updated 9 months ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆17Updated last year
- ☆30Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆41Updated last year
- ☆11Updated 2 weeks ago
- ASR text preprocessing utility☆20Updated last month
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆46Updated 10 months ago
- ☆23Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆13Updated last year
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning☆47Updated 8 months ago
- ☆69Updated this week
- Transformer-based visually grounded speech models☆19Updated last year
- Code for the method proposed in the paper: ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…☆15Updated 6 months ago
- Official implementation of MelHuBERT☆57Updated 2 months ago
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆35Updated 6 months ago
- ☆35Updated 2 years ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆32Updated last year
- Self-Supervised Speech Pre-training and Representation Learning Toolkit.☆8Updated 2 years ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆21Updated 11 months ago
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und…☆40Updated last year
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆34Updated last year
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …☆12Updated 8 months ago
- ASCEND Chinese-English code-switching dataset☆21Updated 2 years ago
- Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together☆42Updated last year
- Keyword spotting and forced alignment in any language☆31Updated 2 months ago
- Repository with the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini…☆11Updated 6 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated last year
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!☆31Updated last year
- Taiwanese Speech Synthesis with Tacotron2☆18Updated last year
- ☆20Updated 3 years ago