JonathanReeve / chapterize
A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books for computational text analysis.
☆114 · Updated 7 years ago
Alternatives and similar repositories for chapterize
Users who are interested in chapterize compare it to the libraries listed below.
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆207 · Updated 2 years ago
- Python 3 library for processing historical English ☆68 · Updated last year
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆316 · Updated 3 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆84 · Updated 2 years ago
- Latin BERT ☆69 · Updated last year
- Python Multilingual Ucrel Semantic Analysis System ☆35 · Updated last week
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆127 · Updated 4 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆78 · Updated this week
- An NLP processing pipeline for characters in fanfiction. Developed by students at Carnegie Mellon University from 2019-2021. ☆34 · Updated last year
- A Python wrapper around the topic modeling functions of MALLET. ☆105 · Updated last year
- Poetic processing, for Python. ☆42 · Updated last year
- Digital Humanities Across Borders ☆50 · Updated last year
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 3 years ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆25 · Updated 4 years ago
- High-performance text aligner for large collections of texts ☆54 · Updated this week
- ☆67 · Updated 5 months ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆139 · Updated last month
- Package to extract connotation frames ☆91 · Updated 2 years ago
- Literary Language Toolkit: code, models, corpora, and web tools ☆11 · Updated last year
- Preliminary spaCy models for Latin ☆14 · Updated 3 years ago
- Detect and align similar passages ☆116 · Updated 4 months ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆29 · Updated 5 years ago
- A modern, interlingual wordnet interface for Python ☆279 · Updated last week
- A module to compute textual lexical richness (aka lexical diversity). ☆112 · Updated 2 years ago
- ☆182 · Updated last year
- A simple interface to the Project Gutenberg corpus. ☆331 · Updated 3 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆38 · Updated 2 months ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆21 · Updated last year
- A Python package for cleaning Gutenberg books and datasets ☆34 · Updated 8 months ago
- Multi Tier Annotation Search ☆26 · Updated 4 years ago