hedrergudene / asr-sd-pipeline
Speech recognition & diarisation solution with text alignment, deployed in AML pipelines
☆100May 7, 2024Updated last year
Alternatives and similar repositories for asr-sd-pipeline
Users that are interested in asr-sd-pipeline are comparing it to the libraries listed below
- ez audio transcription tool with flexible processing and post-processing options☆162Feb 1, 2024Updated 2 years ago
- web based editor for subtitles and transcripts☆144Aug 16, 2024Updated last year
- ASR client for Triton ASR Service☆37Jan 12, 2026Updated last month
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆348Nov 12, 2024Updated last year
- A testing repo to share code and thoughts on diarisation☆57Mar 26, 2024Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 7 months ago
- 🎯 Speech Recognition Challenge by Speech Lab - IIT Madras☆11Nov 5, 2020Updated 5 years ago
- Code and pruned models for our paper: K. Gkrispanis, N. Gkalelis, V. Mezaris, "Filter-Pruning of Lightweight Face Detectors Using a Geome…☆14May 8, 2024Updated last year
- Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)☆15Oct 22, 2025Updated 3 months ago
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper☆5,355Nov 26, 2025Updated 2 months ago
- ☆18Feb 4, 2026Updated last week
- Accelerating faster-whisper single file processing by multiprocessing through parallelization☆56Apr 18, 2023Updated 2 years ago
- ☆28Jan 11, 2026Updated last month
- Whisper command line client compatible with original OpenAI client based on CTranslate2.☆1,219Updated this week
- Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation☆18May 17, 2023Updated 2 years ago
- Record audio or transcribe files using ctranslate2 and whisper!☆172Updated this week
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆536Nov 6, 2023Updated 2 years ago
- Cantonese Selfish Project 廣東話自肥企劃 at PYCON HK 2021☆15Feb 16, 2022Updated 4 years ago
- A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input.☆22Updated this week
- ☆14Jul 11, 2022Updated 3 years ago
- A python package to build AI-powered real-time audio applications☆1,931Feb 12, 2025Updated last year
- A Chrome extension for querying a local LLM using llama-cpp-python; includes a pip package for running the server, 'pip install loca…☆18Oct 9, 2023Updated 2 years ago
- Chat with an AI simulation of anyone as easily as copy-pasting text into a folder!☆18Mar 4, 2023Updated 2 years ago
- Whisper realtime streaming for long speech-to-text transcription and translation☆3,530Nov 12, 2025Updated 3 months ago
- DAMO FSMN VAD C++ inference service☆18Apr 17, 2023Updated 2 years ago
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 8 months ago
- VoxSRC2022 workshop development kit☆19Jul 21, 2022Updated 3 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆80May 20, 2023Updated 2 years ago
- A streaming whisper server for on-prem transcription☆23Aug 15, 2024Updated last year
- Paraformer (Chinese ASR) online ONNX runtime for Python☆53Mar 27, 2024Updated last year
- ☆22Oct 27, 2021Updated 4 years ago
- ☆11Aug 2, 2024Updated last year
- Mirror of hf.co/pyannote/speaker-diarization-3.1☆29Jan 7, 2024Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆56May 26, 2025Updated 8 months ago
- Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning☆23Aug 20, 2023Updated 2 years ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence☆2,759Sep 9, 2025Updated 5 months ago
- ☆28Nov 7, 2023Updated 2 years ago
- Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.☆30Updated this week
- streaming speech to text server using Whisper☆101Jun 2, 2023Updated 2 years ago