raduangelescu / gutenbergpy
Gutenberg cache and query library
☆44Updated last month
Alternatives and similar repositories for gutenbergpy
Users who are interested in gutenbergpy are comparing it to the libraries listed below.
- Poetic processing, for Python.☆42Updated last year
- This is a collection of sentence-level aligned Sanskrit-Tibetan Etexts.☆15Updated 3 years ago
- a python package for cleaning Gutenberg books and dataset☆34Updated 8 months ago
- Find legal citations in any block of text☆198Updated 3 months ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo…☆114Updated 7 years ago
- poetry from dirty ocr☆62Updated 4 years ago
- spaCy extension for Visual Studio Code☆31Updated 10 months ago
- Reference datasets for folktale motifs, tale types, and annotated texts☆16Updated 7 months ago
- ☆55Updated 2 years ago
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4.☆34Updated 2 years ago
- Python wrapper library for the Datamuse API☆82Updated 2 years ago
- ☆116Updated this week
- An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship ty…☆143Updated last year
- An advanced, extensible web front-end for the Manatee-open corpus search engine☆78Updated last month
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects.☆39Updated 2 years ago
- A collection of regular expressions for matching citations to state, federal, and even international law☆40Updated 4 years ago
- Inspect a URL and estimate if it contains a news story☆39Updated this week
- A simple interface to the Project Gutenberg corpus.☆331Updated 2 years ago
- Tool for the Automatic Analysis of Syntactic Sophistication and Complexity☆29Updated 2 years ago
- The official repository for the Project Dialogism Novel Corpus, a dataset of annotated quotations in full-length English novels.☆43Updated 2 years ago
- linguistics backend☆42Updated 2 years ago
- Web service to generate citations and bibliographies using citeproc-js☆64Updated 5 months ago
- A textual corpus database for the digital humanities.☆62Updated 5 years ago
- Scripts that clean up OCR and munge Hathi metadata.☆77Updated 8 years ago
- python library to validate, clean, transform and get metadata of ISBN strings (for devs).☆268Updated last year
- SerendipSlim is a visualization tool for exploring topic models built on large collections of text documents.☆39Updated 7 years ago
- Source files for "An Introduction to VisiData"☆77Updated 10 months ago
- Encoding the Bible in TEI, starting with the Gospels☆26Updated 4 months ago
- A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Sp…☆32Updated 6 months ago
- Group thousands of similar spreadsheet or database text entries in seconds☆157Updated 2 years ago