Speech recognition & diarisation solution with text alignment, deployed in AML pipelines
☆101May 7, 2024Updated 2 years ago
Alternatives and similar repositories for asr-sd-pipeline
Users that are interested in asr-sd-pipeline are comparing it to the libraries listed below.
- ez audio transcription tool with flexible processing and post-processing options☆167Feb 1, 2024Updated 2 years ago
- web based editor for subtitles and transcripts☆146Aug 16, 2024Updated last year
- Accelerating faster-whisper single file processing by multiprocessing through parallelization☆56Apr 18, 2023Updated 3 years ago
- Record audio or transcribe files using ctranslate2 and whisper!☆192Apr 28, 2026Updated last week
- ASR client for Triton ASR Service☆39Jan 12, 2026Updated 3 months ago
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper☆5,511Feb 23, 2026Updated 2 months ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆14Feb 5, 2025Updated last year
- Whisper command line client compatible with original OpenAI client based on CTranslate2.☆1,300Feb 14, 2026Updated 2 months ago
- A testing repo to share code and thoughts on diarisation☆57Mar 26, 2024Updated 2 years ago
- ☆11Sep 5, 2025Updated 8 months ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 10 months ago
- ☆22Apr 26, 2026Updated last week
- Code and pruned models for our paper: K. Gkrispanis, N. Gkalelis, V. Mezaris, "Filter-Pruning of Lightweight Face Detectors Using a Geome…☆14May 8, 2024Updated 2 years ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆350Nov 12, 2024Updated last year
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆547Nov 6, 2023Updated 2 years ago
- Running the F5-TTS by ONNX Runtime standalone with GUI☆24Dec 10, 2024Updated last year
- Whisper realtime streaming for long speech-to-text transcription and translation☆3,611Nov 12, 2025Updated 5 months ago
- jupyter notebooks to fine tune whisper models on Vietnamese using Colab and/or Kaggle and/or AWS EC2☆20Aug 15, 2025Updated 8 months ago
- paraformer (Chinese ASR) online onnx runtime for python☆54Mar 27, 2024Updated 2 years ago
- Transcribe and translate voice into LRC file using Whisper and LLMs (GPT, Claude, et al.).☆653Updated this week
- A streaming whisper server for on-prem transcription☆23Aug 15, 2024Updated last year
- Example using OpenTelemetry to instrument a FastAPI / LangGraph / Langchain application☆11Nov 12, 2024Updated last year
- ☆31Feb 4, 2025Updated last year
- Open source cross-platform implementation of MRCP protocol☆20Mar 3, 2022Updated 4 years ago
- A python package to build AI-powered real-time audio applications☆1,974Feb 12, 2025Updated last year
- ☆15Jul 11, 2022Updated 3 years ago
- ☆35Mar 4, 2026Updated 2 months ago
- ☆67Apr 3, 2023Updated 3 years ago
- Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation☆18May 17, 2023Updated 2 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection☆945Jun 3, 2025Updated 11 months ago
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 10 months ago
- Combine search results with a GPT prompt to get answers about current events.☆21Apr 10, 2023Updated 3 years ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence☆2,813Sep 9, 2025Updated 8 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)☆21,760Apr 4, 2026Updated last month
- DAMO FSMN VAD C++ inference service☆18Apr 17, 2023Updated 3 years ago
- streaming speech to text server using Whisper☆102Jun 2, 2023Updated 2 years ago
- A live multiplayer trivia game where users can bid for the subject of the next question☆29Jan 9, 2026Updated 4 months ago
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector☆8,993Mar 26, 2026Updated last month