Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK

☆3,241

Alternatives and similar repositories for cognitive-services-speech-sdk

Users who are interested in cognitive-services-speech-sdk compare it to the libraries listed below.

Azure-Samples / Cognitive-Speech-TTS
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
☆966 Updated 3 weeks ago
Azure-Samples / AzureSpeechReactSample
This sample shows how to integrate the Azure Speech service into a sample React application. This sample shows design pattern examples fo…
☆157 Updated last year
Azure-Samples / openai
The repository for all Azure OpenAI Samples complementing the OpenAI cookbook.
☆1,245 Updated 2 weeks ago
microsoft / SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
☆1,379 Updated last year
mozilla / DeepSpeech-examples
Examples of how to use or integrate DeepSpeech
☆854 Updated last year
Azure-Samples / Cognitive-Services-Voice-Assistant
Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building client applicati…
☆118 Updated last year
wiseman / py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
☆2,296 Updated last year
FACEGOOD / FACEGOOD-Audio2Face
http://www.facegood.cc
☆1,883 Updated 2 years ago
RapidAI / RapidASR
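📣 A commercial-grade open-source automatic speech recognition library that works out of the box, supports all platforms, and recognizes mixed Chinese and English speech. A cross-platform implementation of ASR inference. It's based on ONNXRuntime and FunASR. We provide …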
☆563 Updated last year
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆6,289 Updated last month
k2-fsa / sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, …
☆1,406 Updated last month
wenet-e2e / wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
☆4,671 Updated last week
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆7,881 Updated last week
resemble-ai / Resemblyzer
A python package to analyze and compare voices with deep learning
☆3,033 Updated last year
google / live-transcribe-speech-engine
Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository …
☆1,464 Updated 2 years ago
Azure-Samples / aoai-realtime-audio-sdk
Azure OpenAI code resources for using gpt-4o-realtime capabilities.
☆822 Updated last month
jaywalnut310 / vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
☆7,556 Updated last year
MontrealCorpusTools / Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
☆1,530 Updated this week
BytedanceSpeech / seed-tts-eval
☆1,368 Updated last year
ranchlai / mandarin-tts
Chinese Mandarin text-to-speech (TTS) based on FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with biaobei …
☆477 Updated 3 years ago
PaddlePaddle / PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text fronten…
☆12,072 Updated 3 weeks ago
jianfch / stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
☆1,936 Updated 2 months ago
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆1,363 Updated 5 months ago
pndurette / gTTS
Python library and CLI tool to interface with Google Translate's text-to-speech API
☆2,482 Updated last month
Azure / gen-cv
Vision AI Solution Accelerator
☆432 Updated 2 months ago
Azure-Samples / aisearch-openai-rag-audio
A simple example implementation of the VoiceRAG pattern to power interactive voice generative AI experiences using RAG with Azure AI Sear…
☆467 Updated last month
modelscope / KAN-TTS
KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-…
☆513 Updated last year
ahmetoner / whisper-asr-webservice
OpenAI Whisper ASR Webservice API
☆2,732 Updated 2 weeks ago
coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
☆2,471 Updated last year
shibing624 / parrots
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy.
☆497 Updated 7 months ago