c-w / gutenberg
A simple interface to the Project Gutenberg corpus.
☆328Updated 2 years ago
Alternatives and similar repositories for gutenberg
Users who are interested in gutenberg are comparing it to the libraries listed below:
- A dataset containing story plots from Wikipedia (books, movies, etc.) and the code for the extractor.☆315Updated 7 years ago
- I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this☆223Updated 2 years ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram…☆254Updated 4 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co…☆315Updated 3 years ago
- A corpus of poetry from Project Gutenberg☆204Updated 6 years ago
- A simple interface for the CMU pronouncing dictionary☆314Updated 11 months ago
- Collection of tools for building diachronic/historical word vectors☆437Updated last year
- A simple Python interface for Darius Kazemi's Corpora Project.☆120Updated 5 years ago
- A command-line program to download text corpora.☆34Updated 7 years ago
- An HTTP interface to the Project Gutenberg corpus.☆77Updated 5 years ago
- Analyse rhyme scheme, metre and form of poems☆131Updated 4 years ago
- A collection of functions that measure the readability of a given body of text☆195Updated 7 years ago
- ☆97Updated 3 years ago
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API.☆65Updated 3 years ago
- Wordnik Python public library☆232Updated 6 years ago
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.☆283Updated 4 months ago
- Scraper for downloading the entire ebooks repository of Project Gutenberg☆151Updated this week
- A Python module for interfacing with the Treetagger by Helmut Schmid.☆75Updated last month
- Wikidata client library for Python☆355Updated last year
- Sample implementation of a politeness model, trained on the Stanford Politeness Corpus☆147Updated 3 years ago
- A toolkit for corpus linguistics☆204Updated 6 years ago
- Analyze text with empath☆333Updated 8 years ago
- Various utilities for processing the data.☆210Updated last week
- NLTK Contrib☆166Updated last year
- Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis☆586Updated last year
- ☆210Updated 4 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo…☆107Updated 6 years ago
- This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik…☆259Updated 8 years ago
- Python port of Kate Compton's Tracery text expansion library.☆257Updated last year
- A large corpus of discourse annotations and relations on ~10K forum threads.☆240Updated 6 years ago