openaudiolab / LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆25 Updated 11 months ago
Alternatives and similar repositories for LLaST
Users who are interested in LLaST are comparing it to the libraries listed below.
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆37 Updated last year
- Collection of scripts from mHuBERT-147. ☆29 Updated 7 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆25 Updated 7 months ago
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning ☆47 Updated last year
- Official release of StyleTalk dataset. ☆67 Updated last year
- ☆33 Updated last year
- Temporary anonymous version ☆22 Updated last year
- ☆35 Updated last year
- Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings) ☆28 Updated 2 years ago
- A TTS Trained on Universal Audio. ☆35 Updated last month
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ☆19 Updated 2 years ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆33 Updated 11 months ago
- A spoken version of the textual story cloze benchmark ☆17 Updated last year
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges. ☆50 Updated this week
- ☆11 Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆59 Updated 8 months ago
- ☆21 Updated last year
- ☆19 Updated last year
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆39 Updated 2 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆17 Updated 11 months ago
- A Neural Audio Codec (NAC) for Universal Audio ☆36 Updated last month
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition. ☆20 Updated last month
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels ☆38 Updated last year
- ☆13 Updated 9 months ago
- ☆39 Updated 9 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 Updated 3 years ago
- ☆36 Updated 3 years ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆63 Updated 3 weeks ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆16 Updated 3 months ago
- multilingual speech aligner ☆74 Updated last year