mpacula / AutoCorpus
AutoCorpus is a set of utilities that enable automatic extraction of language corpora and language models from publicly available datasets. AutoCorpus utilities follow the Unix design philosophy and integrate easily into custom data processing pipelines.
☆37 Updated 13 years ago
Alternatives and similar repositories for AutoCorpus
Users who are interested in AutoCorpus are comparing it to the libraries listed below.
- Generalized Language Modeling toolkit ☆51 Updated 3 years ago
- Theano implementation of the Neural GPU ☆15 Updated 9 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Turning Javascript into a probabilistic programming language ☆58 Updated 8 years ago
- A Recurrent Neural Network trained on all existing TED Talk Transcripts. The model outputs machine generated TED Talks. ☆51 Updated 7 years ago
- Speech modeling using code by Kratarth Goel http://dblp.uni-trier.de/pers/hd/g/Goel:Kratarth ☆9 Updated 10 years ago
- An interactive map of English words, where words with similar meaning appear closer together. ☆22 Updated 10 years ago
- A Python framework for exploring distributional semantic models. ☆85 Updated 9 years ago
- Standalone Semanticizer ☆32 Updated 10 years ago
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- Uses a distributed word representation to find words along the hyperchord of two input words. ☆102 Updated 5 years ago
- code referenced in "Towards universal neural nets: Gibbs machines and ACE", Galin Georgiev, http://arxiv.org/abs/1508.06585 ☆14 Updated 9 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas… ☆12 Updated 11 years ago
- Using word2vec and t-SNE to compare text sources. ☆20 Updated 10 years ago
- Turbo topics find significant multiword phrases in topics. ☆46 Updated 10 years ago
- The Community-enRiched Open WordNet (CROWN) ☆18 Updated 9 years ago
- A fork of the sofia ml machine learning program ☆14 Updated 13 years ago
- Natural Language Question Answering Engine ☆33 Updated 10 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob. ☆102 Updated 9 years ago
- Visualization for hidden Markov model computations ☆14 Updated 10 years ago
- Code for morphological transformations ☆29 Updated 8 years ago
- Random fun with statistical language models. ☆64 Updated 5 years ago
- Topic Model Analyzer ☆62 Updated 9 years ago
- Natural Logic Inference for Common Sense Reasoning ☆61 Updated 6 years ago
- ☆62 Updated 11 years ago
- an opinionated assembly of wordnet for javascript ☆56 Updated 8 years ago
- An implementation of word2vec applied to [stanford philosophy encyclopedia](http://plato.stanford.edu/) ☆35 Updated 8 years ago
- various simple RNNs trained on synthetic grammars ☆30 Updated 9 years ago
- Recurrent Neural Network language modeling toolkit ☆38 Updated 11 years ago
- Common Code Workflow tutorial on Theano ☆16 Updated 9 years ago