AutoCorpus is a set of utilities that enable automatic extraction of language corpora and language models from publicly available datasets. Autocorpus utilities follow the Unix design philosophy and integrate easily into custom data processing pipelines.
☆37Feb 1, 2012Updated 14 years ago
Alternatives and similar repositories for AutoCorpus
Users that are interested in AutoCorpus are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- natural language processing with link-grammar☆18Sep 30, 2009Updated 16 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Mar 6, 2023Updated 3 years ago
- Parallel Tracking and Multiple Mapping☆14Jan 2, 2012Updated 14 years ago
- This is application for dysarthria to improve their pronunciation by using deep learning☆10Dec 29, 2020Updated 5 years ago
- Perform the forced decoding with target transcription☆11Sep 12, 2018Updated 7 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Coqui STT (🐸STT) based forced alignment tool☆13Feb 24, 2022Updated 4 years ago
- ☆13Nov 16, 2022Updated 3 years ago
- ☆18Apr 28, 2021Updated 4 years ago
- visual monocular SLAM☆22Jan 3, 2012Updated 14 years ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core☆15Jun 19, 2023Updated 2 years ago
- Speech Processing & Linguistic Analysis Tool☆11Jun 30, 2019Updated 6 years ago
- Easier analysis of large speech corpora☆23Jun 22, 2021Updated 4 years ago
- Grapheme to phoneme toolkit using joint-modelling + CRFs in java☆14Jul 14, 2018Updated 7 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Tools for working with the CMU Pronunciation Dictionary☆36Sep 5, 2017Updated 8 years ago
- Simple script that crawls the Android Marketplace☆36Jan 15, 2016Updated 10 years ago
- wake word spotting with kaldi☆19Dec 3, 2020Updated 5 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.☆15May 19, 2020Updated 5 years ago
- Deploy Kaldi models using grpc for bidirectional streaming.☆17Sep 30, 2024Updated last year
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Feb 15, 2024Updated 2 years ago
- An app that graphs and compares the pitch contours of spoken language, to help language learners perfect their intonation (Hackbright Spr…☆31Jul 20, 2017Updated 8 years ago
- Recurrent Neural Network language modeling toolkit☆38Jan 23, 2014Updated 12 years ago
- An Online Logic Assistant Based on Coq☆25Feb 15, 2012Updated 14 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- BurrMill core☆22Nov 2, 2021Updated 4 years ago
- SVM Classifier to Detect Sentiment of Tweets☆16Apr 20, 2015Updated 10 years ago
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.☆22Jul 12, 2019Updated 6 years ago
- A free & open tool for transcribing audio interviews with offline ASR support☆25Dec 21, 2023Updated 2 years ago
- A handy dataset of noises for ASR☆22May 29, 2019Updated 6 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…☆12Apr 10, 2014Updated 12 years ago
- Java interfaces and tools for Kaldi speech recognition.☆20Oct 2, 2016Updated 9 years ago
- Open Source AI Database for Voice Agent Transcripts | Call Analysis & Insights | Extraction | Labelling & Classification☆23Nov 3, 2025Updated 5 months ago
- Phonetic and phonological vocoding platform☆17Nov 23, 2016Updated 9 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Code for Harmonic Exponential Families on Manifolds☆10Jun 2, 2016Updated 9 years ago
- Audio Diarization Annotation tool☆30Nov 8, 2019Updated 6 years ago
- Social media crawling engine implemented totally browser side in JS☆98Jul 10, 2016Updated 9 years ago
- ☆13Aug 26, 2019Updated 6 years ago
- Deep Boltzmann Machines in R^N dimensions☆11May 14, 2014Updated 11 years ago
- Simple rules based grapheme to phoneme in Python☆11Sep 2, 2017Updated 8 years ago
- Phone-level evaluation of L2 speakers (GOP algorithm)☆27Mar 1, 2017Updated 9 years ago