speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names.
☆252Feb 10, 2026Updated last month
Alternatives and similar repositories for speechlib
Users who are interested in speechlib are comparing it to the libraries listed below.
- Skribify is a powerful transcription and summarization tool that leverages the power of OpenAI's GPT-4 and WhisperAI to generate concise …☆12Apr 29, 2025Updated 10 months ago
- A testing repo to share code and thoughts on diarisation☆57Mar 26, 2024Updated 2 years ago
- ☆492Sep 10, 2025Updated 6 months ago
- Acoustic echo cancellation (AEC) is a key algorithm in the pipeline of acoustic devices with KWS or ASR. FNLMS is used.☆19Apr 22, 2019Updated 6 years ago
- On-device speaker diarization powered by deep learning☆69Mar 20, 2026Updated last week
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.☆217Oct 30, 2024Updated last year
- The WhisperX API is a containerized solution for transcribing audio files using the powerful `whisperx` model. This API provides an easy-…☆17Aug 24, 2023Updated 2 years ago
- Whisper from OpenAI and diarization with Pyannote☆51Jan 7, 2024Updated 2 years ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection☆927Jun 3, 2025Updated 9 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ez audio transcription tool with flexible processing and post-processing options☆165Feb 1, 2024Updated 2 years ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)☆20,821Mar 17, 2026Updated last week
- Speaker diarization service☆28Feb 24, 2026Updated last month
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…☆9,398Mar 12, 2026Updated 2 weeks ago
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …☆170Dec 12, 2025Updated 3 months ago
- Open-source reproducible benchmarks from Argmax☆83Mar 12, 2026Updated 2 weeks ago
- Generate an OpenAPI/Swagger specification from your SQL database☆15Feb 27, 2026Updated last month
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆11Nov 6, 2024Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- ☆666Sep 24, 2025Updated 6 months ago
- ☆324Jun 14, 2024Updated last year
- Synchronize Whisper's timestamps over an existing accurate transcription☆163May 28, 2024Updated last year
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆348Nov 12, 2024Updated last year
- Open TTS models, built for streaming on the edge☆45Mar 16, 2025Updated last year
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆543Nov 6, 2023Updated 2 years ago
- A collection of custom tools and extensions for Open WebUI that enhance its capabilities☆12Dec 11, 2024Updated last year
- This script is an automated survey bot that conducts political discussions over phone calls. It uses Flask, Twilio's Voice API, OpenAI's …☆12Sep 21, 2023Updated 2 years ago
- A simple Python + Tkinter + Tesseract-based GUI image-to-text copypaste pad application☆11Sep 14, 2023Updated 2 years ago
- ☆357Mar 17, 2024Updated 2 years ago
- How to use OpenAI's Whisper to transcribe and diarize audio files☆374Oct 12, 2022Updated 3 years ago
- A nearly-live implementation of OpenAI's Whisper.☆3,914Mar 17, 2026Updated last week
- Docker image for WhisperX by Max Bain☆12Sep 24, 2025Updated 6 months ago
- A python package to build AI-powered real-time audio applications☆1,958Feb 12, 2025Updated last year
- turnkey self-hosted offline transcription and diarization service with llm summary☆923Jan 18, 2026Updated 2 months ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆443Aug 12, 2025Updated 7 months ago
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi☆12Sep 30, 2022Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago