cdimascio / py-readability-metrics
Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
★361 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for py-readability-metrics
- LexRank algorithm for text summarization ★229 · Updated 7 months ago
- python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ★1,152 · Updated 5 months ago
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ★661 · Updated 8 months ago
- Linguistic Inquiry and Word Count (LIWC) analyzer ★193 · Updated 2 years ago
- Implementation of the ClausIE information extraction system for python+spacy ★220 · Updated 2 years ago
- Fixes contractions such as `you're` to `you are` ★312 · Updated 2 years ago
- A python module for English lemmatization and inflection. ★261 · Updated last year
- PYthon Automated Term Extraction ★305 · Updated last year
- GSDMM: Short text clustering ★353 · Updated last year
- Repository for TweetEval ★357 · Updated 2 years ago
- Contextual word checker for better suggestions (not actively maintained) ★409 · Updated last month
- The SentiWordNet sentiment lexicon ★322 · Updated 2 years ago
- A multilingual lexicon of words to hurt. ★80 · Updated 2 weeks ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ★249 · Updated 2 months ago
- A Python library for calculating a large variety of metrics from text ★315 · Updated last month
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ★512 · Updated 3 weeks ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ★230 · Updated 2 years ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se… ★140 · Updated 11 months ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ★268 · Updated last year
- Cleans Reddit Text Data ★81 · Updated 4 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ★65 · Updated last year
- Elegant and Easy Tweet Preprocessing in Python ★305 · Updated last year
- Text tokenization and sentence segmentation (segtok v2) ★203 · Updated 2 years ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ★1,203 · Updated 10 months ago
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ★575 · Updated 3 months ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ★342 · Updated last year
- Text Mining and Topic Modeling Toolkit for Python with parallel processing power ★193 · Updated last year
- Compute Sentence Embeddings Fast! ★618 · Updated last year
- Catalog of abusive language data (PLoS 2020) ★304 · Updated 5 months ago
- Fuzzy matching and more functionality for spaCy. ★252 · Updated 4 months ago