hedrergudene/asr-sd-pipeline

Speech recognition & diarisation solution with text alignment, deployed in AML pipelines

☆102

Alternatives and similar repositories for asr-sd-pipeline

Users who are interested in asr-sd-pipeline are comparing it to the libraries listed below.

geekodour / wscribe-editor
Web-based editor for subtitles and transcripts
☆147 · Aug 16, 2024 · Updated last year
RomanKlimov / faster-whisper-acceleration
Accelerating faster-whisper single file processing by multiprocessing through parallelization
☆57 · Apr 18, 2023 · Updated 3 years ago
Speech-Lab-IITM / Hindi-ASR-Challenge
🎯 Speech Recognition Challenge by Speech Lab - IIT Madras
☆10 · Nov 5, 2020 · Updated 5 years ago
JacobLinCool / whisper-cli
A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input.
☆22 · Updated this week
allseeteam / whisperx-fastapi
WhisperX FastAPI integration
☆18 · Mar 31, 2024 · Updated 2 years ago
yuekaizhang / Triton-ASR-Client
ASR client for Triton ASR Service
☆39 · Jan 12, 2026 · Updated 6 months ago
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,614 · Feb 23, 2026 · Updated 5 months ago
mirix / approaches-to-diarisation
A testing repo to share code and thoughts on diarisation
☆58 · Mar 26, 2024 · Updated 2 years ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
Softcatala / whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
☆1,332 · Feb 14, 2026 · Updated 5 months ago
artem-kuchumov / web-speech-recorder
Record and save audio using a flask app
☆22 · May 1, 2023 · Updated 3 years ago
llm-lab-org / CLASP
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13 · Jun 27, 2025 · Updated last year
EtienneAb3d / WhisperHallu
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
☆351 · Nov 12, 2024 · Updated last year
akashmjn / tinydiarize
Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens
☆549 · Nov 6, 2023 · Updated 2 years ago
sieve-community / describe
Incredibly descriptive audiovisual summaries for videos
☆40 · Aug 2, 2024 · Updated last year
BBC-Esq / Faster-Whisper-Transcriber
Record audio or transcribe files using ctranslate2 and whisper!
☆210 · Jul 20, 2026 · Updated last week
pengzhendong / speaker-diarization
Offline Speaker Diarization with SenseVoice by Sherpa ONNX.
☆15 · Dec 23, 2024 · Updated last year
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX runtime for Python
☆54 · Mar 27, 2024 · Updated 2 years ago
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,657 · Nov 12, 2025 · Updated 8 months ago
weijiawu / Polygon-free-Unconstrained-Scene-Text-Detection-with-Box-Annotations
Unconstrained Text Detection with Box Supervision and Dynamic Self-Training
☆34 · Nov 24, 2022 · Updated 3 years ago
voxos-ai / streaming-whisper-server
A streaming whisper server for on-prem transcription
☆23 · Aug 15, 2024 · Updated last year
yucongzh / online_speaker_diarization
☆15 · Jul 11, 2022 · Updated 4 years ago
alphacep / unimrcp-vosk-plugin
Open source cross-platform implementation of MRCP protocol
☆20 · Mar 3, 2022 · Updated 4 years ago
frankyoujian / Edge-Punct-Casing
☆33 · Feb 4, 2025 · Updated last year
Mikxox / EnCodec_Trainer
☆67 · Apr 3, 2023 · Updated 3 years ago
lovemefan / CT-Transformer-punctuation
An enterprise-grade Chinese-English code-switching punctuator from FunASR.
☆34 · Apr 26, 2024 · Updated 2 years ago
FrenchKrab / datasets-pyannote
Automatically set up the AISHELL-4 and MSDWild datasets for use with pyannote-database (and pyannote-audio)
☆15 · Oct 22, 2025 · Updated 9 months ago
phineas-pta / fine-tune-whisper-vi
Jupyter notebooks to fine-tune Whisper models on Vietnamese using Colab and/or Kaggle and/or AWS EC2
☆19 · Aug 15, 2025 · Updated 11 months ago
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18 · Oct 20, 2024 · Updated last year
Mihaiii / trivia
A live multiplayer trivia game where users can bid for the subject of the next question
☆29 · Jan 9, 2026 · Updated 6 months ago
JaesungHuh / VoxSRC2022
VoxSRC2022 workshop development kit
☆19 · Jul 21, 2022 · Updated 4 years ago
HHousen / speaker-change-detection
Speaker change detection using SincNet and an LSTM/Transformer
☆57 · May 26, 2025 · Updated last year
nyrahealth / CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
☆1,031 · Updated this week
patientx / F5-TTS-ONNX-gui
Running the F5-TTS by ONNX Runtime standalone with GUI
☆27 · Dec 10, 2024 · Updated last year
hyperfocAIs / Attend
Attend - to what matters.
☆17 · Feb 22, 2025 · Updated last year
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆2,007 · Jun 19, 2026 · Updated last month
nalbion / whisper-server
Streaming speech-to-text server using Whisper
☆103 · Jun 2, 2023 · Updated 3 years ago
linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
☆2,832 · Sep 9, 2025 · Updated 10 months ago
zhuzizyf / damo-fsmn-vad-infer-httpserver
DAMO FSMN VAD C++ inference service
☆17 · Apr 17, 2023 · Updated 3 years ago