ageitgey / Gutenberg
A simple interface to the Project Gutenberg corpus.
☆17Updated 9 years ago
Alternatives and similar repositories for Gutenberg
Users who are interested in Gutenberg are comparing it to the libraries listed below.
- Fast Word Clustering Software☆78Updated 5 months ago
- Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings☆53Updated 8 years ago
- The RadioTalk dataset of talk radio transcripts☆60Updated 4 years ago
- A clean and easy interface for performing nearest-neighbor lookups☆50Updated 5 years ago
- Python package for stylometry☆63Updated 4 years ago
- A neural network based StoryTeller that outputs a short story from an input image☆13Updated 6 years ago
- Code for learning geographically-informed word embeddings☆22Updated 3 years ago
- An API to access data from The New Yorker Caption Contest☆62Updated 2 years ago
- A Python package for cleaning Gutenberg books and datasets☆34Updated 2 months ago
- Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the…☆32Updated 9 years ago
- ☆34Updated 3 years ago
- Jupyter extension to visualize dependency structures☆28Updated 7 years ago
- GloVe word vector embedding experiments (similar to Word2Vec)☆67Updated 2 years ago
- ☆97Updated 3 years ago
- Code for my blog post on Generating Words from Embeddings☆23Updated 11 months ago
- Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies☆70Updated 6 years ago
- A hack to replace Pride & Prejudice text with closest word2vec model word, and visualize results.☆61Updated 10 years ago
- Doing things with embeddings☆66Updated 2 years ago
- Exploring the shapes of stories using indico sentiment analysis APIs☆28Updated 9 years ago
- Non-distributional linguistic word vector representations.☆62Updated 7 years ago
- Featurize words into orthographic and phonological vectors.☆41Updated 2 years ago
- Utility scripts in Python☆37Updated last month
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian.☆154Updated 8 months ago
- The Non-Official Characterization (NOC) List is a knowledge-base containing semantic triples about famous people, living and dead, fictio…☆24Updated 6 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases☆45Updated 5 years ago
- Code for morphological transformations☆29Updated 8 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016…☆68Updated 3 years ago
- Code to reproduce experiments from the EMNLP 2015 paper about Rumour Stance Classification with Gaussian Processes.☆37Updated 9 years ago
- Regex-like pattern matching on sentence trees instead of strings☆42Updated 7 years ago
- Code and data from our ACL 2014 paper "Humans Require Context to Infer Ironic Intent (so Computers Probably do, too)"☆15Updated 11 years ago