linto-ai / linto-stt
An automatic speech recognition API
☆61 Updated this week
Alternatives and similar repositories for linto-stt
Users that are interested in linto-stt are comparing it to the libraries listed below
- On-device speaker diarization powered by deep learning ☆51 Updated this week
- Add n-gram and large language model (LLM) support to Whisper models. ☆21 Updated last month
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆115 Updated 2 years ago
- ONNX Inference of Pyannote Segmentation ☆91 Updated 6 months ago
- Tunable pipelines ☆34 Updated 4 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆102 Updated 4 months ago
- Speaker diarization service ☆23 Updated 2 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆218 Updated this week
- A curated list of awesome voice activity detection ☆57 Updated 7 months ago
- Create an LJSpeech structured voice dataset on wave input ☆30 Updated 8 months ago
- ☆40 Updated last year
- On-device noise suppression powered by deep learning ☆73 Updated this week
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆94 Updated 5 months ago
- ☆38 Updated 3 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆149 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆52 Updated last month
- 🐸STT integration examples ☆129 Updated 2 years ago
- Various speech datasets made available to the public ☆122 Updated 6 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆84 Updated last year
- Convert English text from written expressions into spoken forms ☆25 Updated 3 years ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆165 Updated 2 weeks ago
- A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems ☆215 Updated 4 months ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆102 Updated 2 years ago
- Text to speech alignment using CTC forced alignment ☆300 Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆99 Updated 8 months ago
- python wrapper for rnnoise library ☆48 Updated 2 years ago
- A non-native English corpus for pronunciation scoring task ☆143 Updated 11 months ago
- This repository creates speaker diarization recipes to be used within the egs folder of kaldi. ☆17 Updated 10 months ago
- Speaker diarization python system based on binary key speaker modelling ☆60 Updated 3 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆63 Updated 2 months ago