raduangelescu / gutenbergpy
Gutenberg cache and query library
☆37 Updated 11 months ago
Alternatives and similar repositories for gutenbergpy
Users that are interested in gutenbergpy are comparing it to the libraries listed below
- a python package for cleaning Gutenberg books and dataset ☆34 Updated last month
- A tool for analyzing the word histories of a text. ☆34 Updated 7 months ago
- Multilingual syllable annotation pipeline component for spacy ☆39 Updated 2 years ago
- This is a collection of sentence-level aligned Sanskrit-Tibetan Etexts. ☆15 Updated 3 years ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 Updated 2 months ago
- Poetic processing, for Python. ☆42 Updated last year
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆33 Updated 2 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated 2 years ago
- an experimental implementation of Burrows's Delta in Python 3 ☆21 Updated 3 years ago
- JSON representation of the Zotero data model ☆55 Updated 4 months ago
- A tropes scraper ☆35 Updated 2 years ago
- A Python scraping module, that extracts text from articles found in RSS feeds. Uses SQLite as database. ☆19 Updated 11 months ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Python based Wikidata framework for easy dataframe extraction ☆44 Updated last year
- Scripts for scraping metadata from Project Gutenberg books, via GITenberg. ☆19 Updated 6 years ago
- Language detection using Spacy and Fasttext ☆55 Updated last year
- Scrollership through 20m pubmed abstracts. ☆26 Updated 2 years ago
- A textual corpus database for the digital humanities. ☆62 Updated 4 years ago
- Reference datasets for folktale motifs, tale types, and annotated texts ☆12 Updated last month
- Inspect a URL and estimate if it contains a news story ☆39 Updated 7 months ago
- A maximum-strength name parser for record linkage. ☆37 Updated 2 weeks ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 Updated 3 years ago
- Documentation for the GITenberg books project ☆29 Updated 6 years ago
- This repository provides various Python methods for finding and aggregating synonyms for an individual word or a list of words. ☆33 Updated 2 years ago
- 🌸 Train floret vectors ☆18 Updated 2 years ago
- ☆55 Updated last year
- A News Article Collection Library ☆22 Updated 2 years ago
- WordWanderer – take your text for a walk ☆12 Updated 6 years ago
- Convert ALTO XML to plain text + minimal metadata ☆16 Updated 8 months ago