ageitgey / Gutenberg
A simple interface to the Project Gutenberg corpus.
☆17 Updated 9 years ago
Alternatives and similar repositories for Gutenberg
Users who are interested in Gutenberg are comparing it to the libraries listed below.
- Command-line corpus tools ☆9 Updated 8 years ago
- Maps clauses from a text corpus onto the metrical structure of a poem ☆17 Updated 9 years ago
- Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings ☆53 Updated 8 years ago
- Using embedding-based loss functions for phonetics/speech recognition. ☆17 Updated 10 years ago
- Jupyter extension to visualize dependency structures ☆28 Updated 7 years ago
- Multilingual Language Modeling Toolkit ☆11 Updated 8 years ago
- Regex like pattern tree matching but on sentence's tree instead of Strings ☆42 Updated 7 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 Updated last year
- Fast Word Clustering Software ☆78 Updated 3 months ago
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆63 Updated 9 years ago
- A temporal ordering system for events and time expressions in written text. ☆43 Updated 3 years ago
- A python wrapper for Semaphore, a Shallow Semantic Parser that identifies roles in a text. ☆12 Updated 11 years ago
- Deep learning model of machine translation using attentional and structural biases ☆13 Updated 7 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian. ☆153 Updated 7 months ago
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?" ☆21 Updated 4 years ago
- A Combinatory Categorial Grammar library. ☆22 Updated 11 years ago
- Non-distributional linguistic word vector representations. ☆62 Updated 7 years ago
- A tool for detecting sentence fragments. ☆7 Updated 8 years ago
- A Recurrent Neural Network trained on all existing TED Talk Transcripts. The model outputs machine generated TED Talks. ☆51 Updated 7 years ago
- The Non-Official Characterization (NOC) List is a knowledge-base containing semantic triples about famous people, living and dead, fictio… ☆24 Updated 6 years ago
- This dataset contains naturally-occurring English sentences that feature non-trivial noun-verb ambiguity. ☆35 Updated 6 years ago
- ADS Project ☆14 Updated 9 years ago
- Code for morphological transformations ☆29 Updated 8 years ago
- Exploring the shapes of stories using indico sentiment analysis APIs ☆28 Updated 9 years ago
- A re-implementation of redpony/cdec's tokenize-anything.pl script in python ☆8 Updated 9 years ago
- Code to reproduce experiments in "A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages" ☆9 Updated 6 years ago
- Support library for NLP and machine learning. ☆26 Updated 8 years ago
- ☆97 Updated 3 years ago
- This is the text partitioner project for Python. ☆21 Updated 6 years ago
- Neural Network for Automatic Negation Detection ☆20 Updated 8 years ago