AutoCorpus is a set of utilities that enable automatic extraction of language corpora and language models from publicly available datasets. AutoCorpus utilities follow the Unix design philosophy and integrate easily into custom data processing pipelines.
☆37Feb 1, 2012Updated 14 years ago
Alternatives and similar repositories for AutoCorpus
Users who are interested in AutoCorpus are comparing it to the libraries listed below.
- natural language processing with link-grammar☆18Sep 30, 2009Updated 16 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Mar 6, 2023Updated 3 years ago
- Parallel Tracking and Multiple Mapping☆14Jan 2, 2012Updated 14 years ago
- An application that helps people with dysarthria improve their pronunciation using deep learning☆10Dec 29, 2020Updated 5 years ago
- Perform the forced decoding with target transcription☆11Sep 12, 2018Updated 7 years ago
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Coqui STT (🐸STT) based forced alignment tool☆13Feb 24, 2022Updated 4 years ago
- ☆13Nov 16, 2022Updated 3 years ago
- ☆18Apr 28, 2021Updated 5 years ago
- visual monocular SLAM☆22Jan 3, 2012Updated 14 years ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core☆15Jun 19, 2023Updated 2 years ago
- Speech Processing & Linguistic Analysis Tool☆11Jun 30, 2019Updated 6 years ago
- Easier analysis of large speech corpora☆24Jun 22, 2021Updated 4 years ago
- Grapheme to phoneme toolkit using joint-modelling + CRFs in java☆14Jul 14, 2018Updated 7 years ago
- Tools for working with the CMU Pronunciation Dictionary☆36Sep 5, 2017Updated 8 years ago
- Simple script that crawls the Android Marketplace☆36Jan 15, 2016Updated 10 years ago
- A proxy service to retrieve POIs (Points Of Interest) from several public services (Nominatim, Mapquest, Cloudmade, Geonames, Panoramio, …☆28May 20, 2022Updated 3 years ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma…☆21Jan 24, 2022Updated 4 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆14Jan 24, 2017Updated 9 years ago
- wake word spotting with kaldi☆19Dec 3, 2020Updated 5 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.☆15May 19, 2020Updated 5 years ago
- Deploy Kaldi models using grpc for bidirectional streaming.☆17Sep 30, 2024Updated last year
- A Python 3 compiler that anyone can understand.☆68Jul 9, 2014Updated 11 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Feb 15, 2024Updated 2 years ago
- An app that graphs and compares the pitch contours of spoken language, to help language learners perfect their intonation (Hackbright Spr…☆31Jul 20, 2017Updated 8 years ago
- An Online Logic Assistant Based on Coq☆25Feb 15, 2012Updated 14 years ago
- BurrMill core☆22Nov 2, 2021Updated 4 years ago
- SVM Classifier to Detect Sentiment of Tweets☆16Apr 20, 2015Updated 11 years ago
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.☆22Jul 12, 2019Updated 6 years ago
- Access public YouTube data feeds from your Node.js apps☆99Oct 15, 2016Updated 9 years ago
- ☆22Jul 8, 2021Updated 4 years ago
- A free & open tool for transcribing audio interviews with offline ASR support☆25Dec 21, 2023Updated 2 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…☆12Apr 10, 2014Updated 12 years ago
- A handy dataset of noises for ASR☆22May 29, 2019Updated 6 years ago
- ☆12Nov 16, 2021Updated 4 years ago
- small python app to help practice speech shadowing, helpful for language learning☆15Jun 25, 2020Updated 5 years ago
- exploration of reflective architectures in Scheme☆21May 20, 2022Updated 3 years ago
- Phonetic and phonological vocoding platform☆17Nov 23, 2016Updated 9 years ago
- Code from the paper Reflection for the Masses by Charlotte Herzeel, Pascal Costanza, and Theo D'Hondt.☆15Jun 21, 2021Updated 4 years ago