Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts for audio conversations with actual speaker names and time tags. This library also contains audio preprocessor functions.
☆265Apr 19, 2026Updated last month
Alternatives and similar repositories for speechlib
Users who are interested in speechlib are comparing it to the libraries listed below.
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper☆5,525Feb 23, 2026Updated 3 months ago
- Skribify is a powerful transcription and summarization tool that leverages the power of OpenAI's GPT-4 and WhisperAI to generate concise …☆12Apr 29, 2025Updated last year
- A testing repo to share code and thoughts on diarisation☆57Mar 26, 2024Updated 2 years ago
- ☆491Sep 10, 2025Updated 8 months ago
- Acoustic echo cancellation (AEC) is a core algorithm in the pipeline of acoustic devices with KWS or ASR. FNLMS is used.☆19Apr 22, 2019Updated 7 years ago
- On-device speaker diarization powered by deep learning☆72May 8, 2026Updated 2 weeks ago
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.☆219Oct 30, 2024Updated last year
- The WhisperX API is a containerized solution for transcribing audio files using the powerful `whisperx` model. This API provides an easy-…☆18Aug 24, 2023Updated 2 years ago
- Whisper from OpenAI and diarization with pyannote☆51Jan 7, 2024Updated 2 years ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection☆952Jun 3, 2025Updated 11 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ez audio transcription tool with flexible processing and post-processing options☆167Feb 1, 2024Updated 2 years ago
- Speaker diarization service☆27Feb 24, 2026Updated 3 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)☆22,043Apr 4, 2026Updated last month
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…☆9,953May 19, 2026Updated last week
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …☆179May 7, 2026Updated 2 weeks ago
- Generate an OpenAPI/Swagger specification from your SQL database☆15Feb 27, 2026Updated 2 months ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆11Nov 6, 2024Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- Open-source reproducible benchmarks from Argmax☆88May 12, 2026Updated 2 weeks ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆36Aug 1, 2025Updated 9 months ago
- ☆673Sep 24, 2025Updated 8 months ago
- ☆326Jun 14, 2024Updated last year
- Synchronize Whisper's timestamps over an existing accurate transcription☆164May 28, 2024Updated last year
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆350Nov 12, 2024Updated last year
- Open TTS models, built for streaming on the edge☆45Mar 16, 2025Updated last year
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆547Nov 6, 2023Updated 2 years ago
- A collection of custom tools and extensions for Open WebUI that enhance its capabilities☆12Dec 11, 2024Updated last year
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆36Aug 30, 2025Updated 8 months ago
- This script is an automated survey bot that conducts political discussions over phone calls. It uses Flask, Twilio's Voice API, OpenAI's …☆12Sep 21, 2023Updated 2 years ago
- ☆357Mar 17, 2024Updated 2 years ago
- How to use OpenAI's Whisper to transcribe and diarize audio files☆377Oct 12, 2022Updated 3 years ago
- A nearly-live implementation of OpenAI's Whisper.☆4,031May 15, 2026Updated last week
- Docker image for WhisperX by Max Bain☆13Sep 24, 2025Updated 8 months ago
- A python package to build AI-powered real-time audio applications☆1,975Feb 12, 2025Updated last year
- Turnkey self-hosted offline transcription and diarization service with LLM summary☆933Jan 18, 2026Updated 4 months ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆449Aug 12, 2025Updated 9 months ago
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi☆12Sep 30, 2022Updated 3 years ago