PaddlePaddle / PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
☆11,165 Updated this week
Related projects
Alternatives and complementary repositories for PaddleSpeech
- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech ☆6,876 Updated 11 months ago
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity… ☆7,005 Updated this week
- Production First and Production Ready End-to-End Speech Recognition Toolkit ☆4,183 Updated last week
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, Germa… ☆3,842 Updated 4 months ago
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ☆9,407 Updated last year
- A Deep-Learning-Based Chinese Speech Recognition System ☆7,853 Updated last month
- 🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time ☆35,327 Updated this week
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ☆1,840 Updated last year
- DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code ☆4,329 Updated last year
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation ☆11,992 Updated 4 months ago
- End-to-End Speech Processing Toolkit ☆8,510 Updated this week
- This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion ☆4,753 Updated 4 months ago
- Mandarin Automatic Speech Recognition ☆1,879 Updated 3 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆35,508 Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆35,482 Updated 3 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆71,523 Updated last week
- Faster Whisper transcription with CTranslate2 ☆12,540 Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆6,327 Updated this week
- ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source. ☆9,431 Updated 4 months ago
- DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Ras… ☆25,366 Updated 2 months ago
- An open-source tool-augmented conversational language model from Fudan University ☆11,961 Updated 4 months ago
- PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html ☆2,052 Updated last year
- Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out! ☆1,163 Updated 9 months ago
- Core Engine of Singing Voice Conversion & Singing Voice Clone ☆2,674 Updated 6 months ago
- ChatGLM-6B: An Open Bilingual Dialogue Language Model ☆40,706 Updated 4 months ago
- This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Mult… ☆10,817 Updated 3 weeks ago
- Multilingual Voice Understanding Model ☆3,450 Updated last month
- High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model ☆8,479 Updated 3 months ago
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆6,334 Updated this week