pgcorpus / gutenberg-analysis
Analysis of the Gutenberg dataset
☆43Updated 6 years ago
Alternatives and similar repositories for gutenberg-analysis:
Users who are interested in gutenberg-analysis are comparing it to the repositories listed below
- Finds linguistic patterns effortlessly☆35Updated last year
- This is a simple Python package for calculating a variety of lexical diversity indices☆69Updated last year
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python☆101Updated last month
- Training Temporal Word Embeddings with a Compass☆64Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆79Updated 6 months ago
- Linguistic and stylistic complexity measures for (literary) texts☆79Updated last year
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure"☆28Updated 4 years ago
- An implementation of GrASP (Shnarch et al., 2017)☆21Updated 2 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing.☆19Updated 2 years ago
- MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations.☆15Updated 5 years ago
- Code for the paper "Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora", ACL 2020.☆18Updated 4 years ago
- A module to compute textual lexical richness (aka lexical diversity).☆99Updated last year
- Semantically Structured Sentence Embeddings☆66Updated 3 months ago
- Lexicons for the Multilingual UCREL Semantic Analysis System☆40Updated last year
- A small repository to test Captum Explainable AI with a trained Flair transformers-based text classifier.☆26Updated 3 years ago
- BERT and ELECTRA models trained on Europeana Newspapers☆37Updated 3 years ago
- Package to extract connotation frames☆83Updated last year
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP.☆86Updated 3 weeks ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling☆23Updated 4 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆63Updated 10 months ago
- CrowdTruth framework for crowdsourcing ground truth for training & evaluation of AI systems☆57Updated 9 months ago
- A Python package for cleaning Gutenberg books and datasets☆33Updated last year
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?"☆21Updated 4 years ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban…☆104Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫☆89Updated last year