linto-ai / linto-stt
An automatic speech recognition API
☆57Updated last week
Alternatives and similar repositories for linto-stt:
Users who are interested in linto-stt are comparing it to the libraries listed below
- On-device speaker diarization powered by deep learning☆44Updated this week
- Various speech datasets made available to the public☆116Updated 4 months ago
- Model for recasing and repunctuating ASR transcripts☆133Updated last year
- A curated list of awesome voice activity detection☆50Updated 5 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆148Updated last year
- Speaker diarization service☆21Updated 3 weeks ago
- Python server for communicating with Kaldi from the browser using WebRTC☆69Updated last year
- 🐸STT integration examples☆126Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆100Updated 3 months ago
- Open models for Coqui STT☆138Updated 2 years ago
- Real-Time Whisper Voice Recognition with Vosk model feedback.☆112Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆50Updated 10 months ago
- On-device voice activity detection (VAD) powered by deep learning☆213Updated this week
- ONNX Inference of Pyannote Segmentation☆87Updated 4 months ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆137Updated 4 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆112Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆102Updated 2 years ago
- Speaker diarization model☆27Updated 2 years ago
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning☆39Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆83Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆63Updated last month
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆94Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆162Updated 2 weeks ago
- On-device noise suppression powered by deep learning☆69Updated this week
- A model that predicts the punctuation of English, Italian, French and German texts.☆80Updated 2 years ago
- C++ version of pyannote audio speaker diarization pipeline☆21Updated last year