ebu / benchmarkstt
Open-source AI toolkit for benchmarking speech-to-text services
☆55 Updated last year
Alternatives and similar repositories for benchmarkstt
Users interested in benchmarkstt also compare it to the repositories listed below
- Command line tool to create corpora for Common Voice ☆77 Updated last year
- Linguistic processing for Common Voice ☆55 Updated last year
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning ☆40 Updated 2 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆107 Updated 3 weeks ago
- Forced Alignments for Common Voice ☆31 Updated 4 years ago
- Crawling and creating a German language model resource ☆19 Updated 2 years ago
- Evaluate results from ASR/Speech-to-Text quickly ☆37 Updated 3 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆102 Updated 4 months ago
- Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to… ☆33 Updated 4 years ago
- Scripts for training Kaldi for German speech recognition (ASR). ☆24 Updated 4 years ago
- Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone ☆35 Updated 3 years ago
- Various speech datasets made available to the public ☆122 Updated 6 months ago
- Python wrapper for phonetisaurus grapheme to phoneme tool ☆12 Updated 4 years ago
- Praaline is an open-source system to manage, annotate, visualise and analyse spoken language corpora ☆30 Updated 2 years ago
- 🫠 check your data, before you wreck your model ☆16 Updated 2 years ago
- Audiobook alignment for Indigenous languages ☆40 Updated this week
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆18 Updated 2 years ago
- Labeled data for homograph disambiguation ☆59 Updated 2 years ago
- A lightweight library to compute Diarization Error Rate (DER). ☆59 Updated last year
- Unicode Standard tokenization routines and orthography profile segmentation ☆37 Updated 4 months ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library. ☆15 Updated 5 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks ☆65 Updated 4 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 2 years ago
- Advanced data structures for handling temporal segments with attached labels. ☆113 Updated 4 months ago
- 🐍 Coqui's machine learning job scheduler ☆32 Updated 3 years ago
- An in-browser app for labeling audio clips at random, using Docker and Flask. ☆53 Updated 7 years ago