R1ckShi / SeACo-Paraformer
[ICASSP2023] Source code, model links, and open test sets for the SeACo-Paraformer paper.
☆27Updated 10 months ago
Alternatives and similar repositories for SeACo-Paraformer:
Users who are interested in SeACo-Paraformer are comparing it to the repositories listed below:
- Speech samples and code of BEdit-TTS☆32Updated last year
- (WIP) Long-form speech generation☆29Updated last month
- Optimized loss based on cross-entropy (CE), like MWER (minimum WER) Loss with beam search and negative sampling strategy, Smoothed Max Po…☆20Updated 3 months ago
- AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data☆29Updated last year
- "SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"☆74Updated last year
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations☆44Updated 2 weeks ago
- ☆63Updated last year
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆39Updated last year
- ☆15Updated 6 months ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆33Updated last year
- Open Source Speech/Text Data on AI☆18Updated 2 years ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".☆24Updated last year
- Official release of StyleTalk dataset.☆60Updated 6 months ago
- Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆60Updated 9 months ago
- ConMamba for Automatic Speech Recognition☆54Updated 5 months ago
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization☆93Updated 11 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆61Updated 2 months ago
- Objective metrics used in several text-to-speech (TTS) papers.☆46Updated 2 years ago
- ☆36Updated last year
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆39Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆43Updated 2 months ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆84Updated 2 months ago
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr…☆16Updated 10 months ago
- Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings)☆27Updated last year
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆49Updated 3 weeks ago
- ☆44Updated last year
- ☆25Updated 6 months ago