c-w / gutenberg
A simple interface to the Project Gutenberg corpus.
☆326 · Updated 2 years ago
Alternatives and similar repositories for gutenberg:
Users who are interested in gutenberg are comparing it to the libraries listed below.
- An HTTP interface to the Project Gutenberg corpus. ☆77 · Updated 5 years ago
- Analyse rhyme scheme, metre and form of poems ☆130 · Updated 3 years ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram… ☆253 · Updated 4 years ago
- A dataset containing story plots from Wikipedia (books, movies, etc.) and the code for the extractor. ☆316 · Updated 7 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆315 · Updated 3 years ago
- A simple interface for the CMU pronouncing dictionary ☆311 · Updated 8 months ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆107 · Updated 6 years ago
- ☆97 · Updated 3 years ago
- A textual corpus database for the digital humanities. ☆62 · Updated 4 years ago
- A corpus of poetry from Project Gutenberg ☆202 · Updated 6 years ago
- A toolkit for corpus linguistics ☆205 · Updated 5 years ago
- A command-line program to download text corpora. ☆34 · Updated 7 years ago
- Scraper for downloading the entire ebooks repository of project Gutenberg ☆148 · Updated 2 weeks ago
- Collection of tools for building diachronic/historical word vectors ☆431 · Updated last year
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API. ☆65 · Updated 3 years ago
- A Python Wiktionary Parser ☆359 · Updated 2 months ago
- A large corpus of discourse annotations and relations on ~10K forum threads. ☆239 · Updated 6 years ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆133 · Updated last month
- Import tables from any Wikipedia article as a dataset in Python ☆291 · Updated 3 years ago
- Python package for stylometry ☆63 · Updated 4 years ago
- System for building, visualizing, and working with LDA topic models ☆96 · Updated 3 weeks ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆182 · Updated last year
- ☆146 · Updated 8 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 · Updated 3 years ago
- The WordSeer text analysis tool, written in Flask. ☆43 · Updated 9 years ago
- The Art of Literary Text Analysis ☆165 · Updated 6 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆63 · Updated 3 weeks ago
- Practical Approaches to Data Science with Text ☆39 · Updated 5 years ago
- wpcorpus - NLP corpus based on Wikipedia's full article dump ☆97 · Updated 9 years ago
- Python wrapper library for the Datamuse API ☆78 · Updated 2 years ago