WangHelin1997 / SpeechTasks
A list of speech tasks and datasets that can supply training data for generative AI (AIGC), AI model training, intelligent speech tool development, and speech applications.
☆73 · Updated 7 months ago
Alternatives and similar repositories for SpeechTasks:
Users interested in SpeechTasks are comparing it to the repositories listed below:
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆84 · Updated 2 months ago
- Multilingual speech aligner ☆73 · Updated last year
- "SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks": speech processing with the prompting paradigm ☆81 · Updated last year
- Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models ☆95 · Updated last week
- ☆48 · Updated 2 months ago
- The open source code for the SimpleSpeech series ☆122 · Updated 3 months ago
- ☆63 · Updated 4 months ago
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆67 · Updated last month
- Reference-aware automatic speech evaluation toolkit ☆140 · Updated last month
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆39 · Updated last year
- The open source code for LLM-Codec ☆123 · Updated 5 months ago
- ConMamba for Automatic Speech Recognition ☆54 · Updated 5 months ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆70 · Updated 4 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆166 · Updated 6 months ago
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch ☆68 · Updated last year
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆120 · Updated 7 months ago
- ☆51 · Updated last year
- ☆63 · Updated last year
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS) ☆48 · Updated 7 months ago
- A list of papers for child ASR ☆35 · Updated 3 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆77 · Updated last month
- [AAAI 2024] Code for CTX-vec2wav in UniCATS ☆126 · Updated 7 months ago
- Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆140 · Updated last year
- ☆43 · Updated last year
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆47 · Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆49 · Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆51 · Updated last year
- Source for the Interspeech 2024 paper "Scaling up masked audio encoder learning for general audio classification" ☆49 · Updated 3 weeks ago
- Code for the paper "A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing" ☆86 · Updated 4 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 · Updated last year