c-w / gutenberg
A simple interface to the Project Gutenberg corpus.
☆323Updated 2 years ago
Alternatives and similar repositories for gutenberg:
Users who are interested in gutenberg are comparing it to the libraries listed below:
- I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this☆217Updated last year
- An HTTP interface to the Project Gutenberg corpus.☆77Updated 5 years ago
- A simple interface for the CMU pronouncing dictionary☆305Updated 6 months ago
- Analyse rhyme scheme, metre and form of poems☆126Updated 3 years ago
- A corpus of poetry from Project Gutenberg☆195Updated 6 years ago
- A simple Python interface for Darius Kazemi's Corpora Project.☆120Updated 5 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co…☆311Updated 3 years ago
- A command-line program to download text corpora.☆34Updated 7 years ago
- ☆97Updated 3 years ago
- A dataset containing story plots from Wikipedia (books, movies, etc.) and the code for the extractor.☆314Updated 7 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus☆167Updated last year
- ☆30Updated 7 years ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram…☆253Updated 4 years ago
- A textual corpus database for the digital humanities.☆60Updated 4 years ago
- Python port of Kate Compton's Tracery text expansion library.☆256Updated 11 months ago
- A Python package for cleaning Gutenberg books and datasets☆34Updated last year
- Sample implementation of a politeness model, trained on the Stanford Politeness Corpus☆148Updated 2 years ago
- Official releases of the PROIEL treebank of ancient Indo-European languages☆37Updated last year
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.☆281Updated 2 months ago
- A Python module for interfacing with the Treetagger by Helmut Schmid.☆75Updated 3 years ago
- NLTK Contrib☆166Updated 11 months ago
- A collection of functions that measure the readability of a given body of text☆191Updated 7 years ago
- Metadata from Project Gutenberg☆41Updated last month
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API.☆65Updated 2 years ago
- I have this big list of links to text stuff that I like, so I thought I'd make it into a repository.☆67Updated 6 years ago
- A command-line tool for interacting with books in git☆110Updated 5 months ago
- PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, an…☆479Updated last year
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo…☆104Updated 6 years ago
- A versioned Python wrapper package for cmudict (https://github.com/cmusphinx/cmudict).☆61Updated last month
- Python library for reading and writing warc files☆239Updated 2 years ago