0nutation / SLMTokBench
SLMTokBench for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆32 Updated last year
Related projects:
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆33 Updated last week
- Official release of StyleTalk dataset. ☆53 Updated 2 months ago
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning ☆46 Updated 10 months ago
- "SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts" ☆74 Updated last year
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆39 Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆41 Updated last year
- Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings) ☆25 Updated last year
- The open source code for LLM-Codec ☆106 Updated last month
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch. ☆56 Updated last year
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆34 Updated 2 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆40 Updated 2 months ago
- ConMamba for Automatic Speech Recognition ☆38 Updated last month
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆61 Updated 2 weeks ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆21 Updated 6 months ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆47 Updated 3 months ago
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation ☆23 Updated 6 months ago
- [AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS ☆60 Updated 6 months ago
- [ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆32 Updated last month
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆38 Updated last week
- E2E TTS using Conditional Flow Matching (Experimental*) ☆65 Updated 10 months ago
- A spoken version of the textual story cloze benchmark ☆12 Updated last year
- Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆127 Updated last year