mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

☆26,534

Alternatives and similar repositories for DeepSpeech

Users who are interested in DeepSpeech are comparing it to the libraries listed below.

flashlight / wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
☆6,434 Updated 8 months ago
mozilla / TTS
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
☆9,914 Updated last year
kaldi-asr / kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
☆15,000 Updated this week
coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
☆2,478 Updated last year
espnet / espnet
End-to-End Speech Processing Toolkit
☆9,314 Updated last week
common-voice / common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
☆3,391 Updated this week
alphacep / vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
☆12,781 Updated last week
buriburisuri / speech-to-text-wavenet
Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition based on DeepMind's WaveNet and tensorflow
☆3,982 Updated 3 years ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆10,144 Updated last week
Uberi / speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
☆8,808 Updated 2 months ago
commaai / openpilot
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
☆55,523 Updated this week
zzw922cn / Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
☆2,841 Updated 2 years ago
SeanNaren / deepspeech.pytorch
Speech Recognition using DeepSpeech2.
☆2,127 Updated 2 years ago
keithito / tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
☆2,981 Updated 2 years ago
RasaHQ / rasa
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, …
☆20,423 Updated last week
mozilla / DeepSpeech-examples
Examples of how to use or integrate DeepSpeech
☆854 Updated 2 years ago
cmusphinx / pocketsphinx
A small speech recognizer
☆4,156 Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆41,599 Updated 11 months ago
CMU-Perceptual-Computing-Lab / openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
☆32,815 Updated 11 months ago
pannous / tensorflow-speech-recognition
🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks
☆2,171 Updated last year
deeppavlov / DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
☆6,913 Updated this week
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆7,916 Updated 2 weeks ago
julius-speech / julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
☆1,896 Updated last month
openai / gpt-3
GPT-3: Language Models are Few-Shot Learners
☆15,762 Updated 4 years ago
explosion / spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
☆32,010 Updated last month
neonbjb / tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
☆14,440 Updated 8 months ago
PaddlePaddle / PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text fronten…
☆12,089 Updated this week
zzw922cn / awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synth…
☆3,053 Updated last year
NVIDIA / NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆15,208 Updated this week
flashlight / flashlight
A C++ standalone library for machine learning
☆5,398 Updated 3 months ago