AGENDD / RWKV-ASR
This repo is an exploratory experiment in enabling frozen pretrained RWKV language models to accept speech input. We follow the idea of SLAM_ASR, using an RWKV language model as the LLM; instead of writing a prompt template, we fine-tune the initial state of the RWKV model directly.
☆51Updated 6 months ago
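The idea of tuning only the initial state of a frozen recurrent model can be illustrated with a toy example. The sketch below is not taken from RWKV-ASR: it assumes a scalar linear recurrence in place of RWKV, and the names `run` and `tune_initial_state` and all constants are hypothetical. It only shows that the output can be steered toward a target while the recurrent weights `a` and `b` stay untouched.

```python
# Toy illustration of "state tuning": the recurrent weights stay frozen and
# only the initial state h0 is trained. This is NOT the RWKV-ASR code.

def run(h0, inputs, a=0.9, b=0.5):
    """Frozen linear recurrence h_t = a * h_{t-1} + b * x_t; returns the final state."""
    h = h0
    for x in inputs:
        h = a * h + b * x
    return h

def tune_initial_state(inputs, target, a=0.9, b=0.5, lr=0.5, steps=200):
    """Gradient descent on h0 only; a and b are never updated."""
    h0 = 0.0
    # For this linear recurrence, d(final state) / d(h0) = a ** len(inputs).
    grad_scale = a ** len(inputs)
    for _ in range(steps):
        error = run(h0, inputs, a, b) - target
        # Gradient of the squared error with respect to h0.
        h0 -= lr * 2.0 * error * grad_scale
    return h0

inputs = [1.0, -0.5, 0.25]
h0 = tune_initial_state(inputs, target=1.0)
final = run(h0, inputs)
```

In a real model the state is a set of tensors per layer and the gradient comes from backpropagation, but the division of roles is the same: the pretrained weights are fixed and the learned initial state plays the part a prompt would otherwise play.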
Alternatives and similar repositories for RWKV-ASR
Users that are interested in RWKV-ASR are comparing it to the libraries listed below
- RWKV-SpeechChat is a real-time dialogue script based on a frozen 3B RWKV model with trained adapters and initial states. Various trained …☆27Updated 5 months ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts).☆76Updated last week
- Official implementation of the TTS model Lina-Speech☆165Updated 5 months ago
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"☆85Updated 3 weeks ago
- flow mirror models from JZX AI Labs☆44Updated 8 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆98Updated 8 months ago
- StyleTTS 2 Optimized Training Fork☆31Updated 4 months ago
- Official release of StyleTalk dataset.☆66Updated 11 months ago
- Llasa Speed Up☆35Updated 3 weeks ago
- ☆28Updated 4 months ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆92Updated last month
- ☆13Updated last year
- An unofficial PyTorch implementation of VALL-E☆87Updated 3 weeks ago
- RWKV-LM-V7(https://github.com/BlinkDL/RWKV-LM) Under Lightning Framework☆28Updated last week
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.☆76Updated last month
- A TTS Trained on Universal Audio.☆34Updated 2 weeks ago
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆24Updated 11 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆73Updated 7 months ago
- ☆21Updated 7 months ago
- ☆67Updated 9 months ago
- Official Code for ParrotTTS☆51Updated 8 months ago
- ☆50Updated 2 months ago
- Implementation of Google's USM speech model in Pytorch☆31Updated 2 months ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated 11 months ago
- ☆129Updated last week
- ☆36Updated last month
- Export an ONNX graph that performs ISTFT. Designed for TTS models.☆24Updated last year
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.☆101Updated 3 months ago
- All generative models in one for a better TTS model☆71Updated 9 months ago
- ☆18Updated last year