raduangelescu / gutenbergpy
Gutenberg cache and query library
☆37 · Updated 9 months ago
Alternatives and similar repositories for gutenbergpy:
Users who are interested in gutenbergpy are comparing it to the libraries listed below.
- A Python package for cleaning Gutenberg books and dataset ☆34 · Updated last week
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆32 · Updated 2 years ago
- A textual corpus database for the digital humanities. ☆62 · Updated 4 years ago
- Poetic processing, for Python. ☆40 · Updated 11 months ago
- Multilingual syllable annotation pipeline component for spaCy ☆39 · Updated 2 years ago
- Explore your own text collection with a topic model – without prior knowledge. ☆62 · Updated 4 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- Web service to generate citations and bibliographies using citeproc-js ☆62 · Updated last year
- WordWanderer – take your text for a walk ☆12 · Updated 5 years ago
- Libraries, Archives and Museums (LAM) ☆82 · Updated 2 years ago
- DHLAB is a library of Python modules for accessing text and pictures at the National Library of Norway. ☆22 · Updated 2 weeks ago
- A simple, accessible, mobile-ready textbook on HCI and Design. ☆22 · Updated 6 months ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆106 · Updated 6 years ago
- Neo4j powered web application for multimedia collections: bring graph-based exploration and crowd-based indexation. ☆24 · Updated 4 years ago
- A Python library for generating word tree diagrams ☆25 · Updated 4 years ago
- Inspect a URL and estimate if it contains a news story ☆39 · Updated 5 months ago
- An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship ty… ☆93 · Updated 11 months ago
- Natural language processing on 12k+ country lyrics🍺 ☆28 · Updated 6 years ago
- Interactive Visualization Interface for Multidimensional Datasets ☆58 · Updated 2 months ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆36 · Updated last year
- Scripts that clean up OCR and munge Hathi metadata. ☆76 · Updated 7 years ago
- This repository makes available the Talk of Norway (ToN) dataset, a collection of Norwegian parliament speeches from 1998 to 2016. Every … ☆31 · Updated last year
- A library that provides an ergonomic, DOM-like model for XML encoded text documents. ☆17 · Updated 3 weeks ago
- JSON representation of the Zotero data model ☆54 · Updated 2 months ago
- Easily display Zotero items on a webpage ☆32 · Updated 2 years ago
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated last year
- A lightweight Python library for working with Akoma Ntoso documents. ☆18 · Updated 2 weeks ago
- Free-for-all repository of TEI and plain text files for you (to do cool stuff) provided by the Digital Collections Services group at the … ☆27 · Updated 8 years ago
- Add website scraping abilities to Datasette ☆62 · Updated 2 years ago
- Scrollership through 20m PubMed abstracts. ☆26 · Updated last year