speechlib is a library that performs speaker diarization, transcription, and speaker recognition on an audio file to create transcripts with actual speaker names.
☆252 · Feb 10, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for speechlib
Users who are interested in speechlib are comparing it to the libraries listed below.
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper ☆5,395 · Feb 23, 2026 · Updated last week
- On-device speaker diarization powered by deep learning ☆69 · Updated this week
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 · Nov 6, 2024 · Updated last year
- Skribify is a powerful transcription and summarization tool that leverages the power of OpenAI's GPT-4 and WhisperAI to generate concise … ☆12 · Apr 29, 2025 · Updated 10 months ago
- ☆493 · Sep 10, 2025 · Updated 5 months ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆13 · Dec 4, 2024 · Updated last year
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models ☆35 · Oct 13, 2024 · Updated last year
- A testing repo to share code and thoughts on diarisation ☆57 · Mar 26, 2024 · Updated last year
- Acoustic echo cancellation (AEC) is a main algorithm in the pipeline of acoustic devices with KWS or ASR. FNLMS is used. ☆19 · Apr 22, 2019 · Updated 6 years ago
- rmp data ranking ☆13 · Nov 4, 2025 · Updated 4 months ago
- Docker image for WhisperX by Max Bain ☆12 · Sep 24, 2025 · Updated 5 months ago
- Speaker diarization service ☆28 · Feb 24, 2026 · Updated last week
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆217 · Oct 30, 2024 · Updated last year
- Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR ☆78 · Jun 8, 2025 · Updated 8 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Aug 18, 2023 · Updated 2 years ago
- Open-source reproducible benchmarks from Argmax ☆82 · Feb 20, 2026 · Updated 2 weeks ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu… ☆27 · Mar 5, 2024 · Updated 2 years ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆920 · Jun 3, 2025 · Updated 9 months ago
- Open TTS models, built for streaming on the edge ☆45 · Mar 16, 2025 · Updated 11 months ago
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆166 · Dec 12, 2025 · Updated 2 months ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 · Oct 19, 2022 · Updated 3 years ago
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi ☆12 · Sep 30, 2022 · Updated 3 years ago
- ☆14 · Jul 24, 2025 · Updated 7 months ago
- The WhisperX API is a containerized solution for transcribing audio files using the powerful `whisperx` model. This API provides an easy-… ☆17 · Aug 24, 2023 · Updated 2 years ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆36 · Feb 11, 2025 · Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆35 · Aug 1, 2025 · Updated 7 months ago
- Whisper from OpenAI and diarization with Pyannote ☆51 · Jan 7, 2024 · Updated 2 years ago
- ez audio transcription tool with flexible processing and post-processing options ☆163 · Feb 1, 2024 · Updated 2 years ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆539 · Nov 6, 2023 · Updated 2 years ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆105 · Jan 10, 2025 · Updated last year
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform ☆18 · May 17, 2024 · Updated last year
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation ☆24 · Nov 12, 2025 · Updated 3 months ago
- ☆17 · Apr 14, 2023 · Updated 2 years ago
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker… ☆9,274 · Feb 20, 2026 · Updated 2 weeks ago
- Python runtime for WeTextProcessing (does not depend on Pynini) ☆48 · Nov 28, 2025 · Updated 3 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆20,368 · Feb 22, 2026 · Updated last week
- ☆30 · Jun 12, 2025 · Updated 8 months ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. ☆14 · Sep 19, 2022 · Updated 3 years ago
- speaker-disentangled speech linguistic content quantizer ☆24 · Mar 19, 2025 · Updated 11 months ago