raduangelescu / gutenbergpy
Gutenberg cache and query library
☆35 Updated 5 months ago
Alternatives and similar repositories for gutenbergpy:
Users who are interested in gutenbergpy are comparing it to the libraries listed below.
- A Python package for cleaning Gutenberg books and datasets ☆32 Updated last year
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆103 Updated 6 years ago
- A tool for analyzing the word histories of a text. ☆34 Updated last month
- WordWanderer – take your text for a walk ☆12 Updated 5 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- Python-based Wikidata framework for easy dataframe extraction ☆41 Updated last year
- Datasette plugin for uploading CSV files and converting them to database tables ☆25 Updated 9 months ago
- Multilingual syllable annotation pipeline component for spaCy ☆39 Updated last year
- ☆54 Updated last year
- Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity. ☆10 Updated last year
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆32 Updated last year
- Poetic processing, for Python. ☆40 Updated 8 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated 10 months ago
- ☆54 Updated last year
- ☆27 Updated 6 months ago
- An experimental implementation of Burrows's Delta in Python 3 ☆20 Updated 3 years ago
- Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages. ☆20 Updated last month
- List of easy American-English words: The New Dale-Chall (1995) ☆32 Updated 2 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 Updated last year
- Discourse Analysis Tool Suite ☆18 Updated this week
- Text Mining and Topic Modeling Toolkit for Python with parallel processing power ☆16 Updated last year
- A textual corpus database for the digital humanities. ☆60 Updated 4 years ago
- A web framework to display Cross Linguistic Linked Data. ☆55 Updated 2 months ago
- Tools for interactive visual exploration of semantic embeddings. ☆29 Updated 4 months ago
- 📑 Python Package to reconstruct the original continuous text from PDFs with language models ☆32 Updated last year
- A Python library for generating word tree diagrams ☆25 Updated 4 years ago
- SFGram (Science-Fiction Gram) is a dataset of public science-fiction novels, books and movie covers. It is designed to be used by researc… ☆30 Updated 6 years ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆23 Updated 3 years ago
- Free-for-all repository of TEI and plain text files for you (to do cool stuff) provided by the Digital Collections Services group at the … ☆27 Updated 7 years ago
- A simple, accessible, mobile-ready textbook on HCI and Design. ☆22 Updated 3 months ago