cdimascio / py-readability-metrics
📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
☆378 · Updated 7 months ago
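As a rough illustration of what two of the formulas named above compute, here is a minimal, self-contained sketch of the Flesch-Kincaid Grade Level and the Automated Readability Index (ARI) using their standard published coefficients. It is not py-readability-metrics' own code or API: the function names (`count_syllables`, `text_counts`, `flesch_kincaid_grade`, `automated_readability_index`) are hypothetical, and the regex tokenizer and vowel-group syllable counter are simplifying assumptions that will differ from a real library's tokenization.

```python
import re


def count_syllables(word: str) -> int:
    """Naive syllable estimate: count runs of consecutive vowels (assumption)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" as non-syllabic ("late" -> 1), but keep "-le" ("table" -> 2).
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int, int]:
    """Return (sentences, words, syllables, characters) using simple regex rules."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    characters = sum(len(w) for w in words)  # letters and digits only
    return len(sentences), len(words), syllables, characters


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences, words, syllables, _ = text_counts(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(text: str) -> float:
    """ARI: 4.71*(characters/words) + 0.5*(words/sentences) - 21.43."""
    sentences, words, _, characters = text_counts(text)
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog ran."
    print(round(flesch_kincaid_grade(sample), 3))
    print(round(automated_readability_index(sample), 3))
```

Very short, simple texts can produce negative grade levels with these formulas; scores are more meaningful on passages of a hundred words or more.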
Alternatives and similar repositories for py-readability-metrics:
Users who are interested in py-readability-metrics are comparing it to the libraries listed below.
- A Python library for calculating a large variety of metrics from text ☆337 · Updated 4 months ago
- python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ☆1,271 · Updated 2 months ago
- Implementation of the ClausIE information extraction system for python+spacy ☆222 · Updated 2 years ago
- Linguistic Inquiry and Word Count (LIWC) analyzer ☆211 · Updated 3 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆106 · Updated last year
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆520 · Updated 6 months ago
- Fixes contractions such as `you're` to `you are` ☆318 · Updated 2 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆255 · Updated 8 months ago
- A python module for English lemmatization and inflection. ☆268 · Updated last year
- This is a simple Python package for calculating a variety of lexical diversity indices ☆75 · Updated last year
- PYthon Automated Term Extraction ☆311 · Updated 2 years ago
- Google USE (Universal Sentence Encoder) for spaCy ☆184 · Updated 2 years ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆195 · Updated 2 years ago
- Cleans Reddit Text Data ☆83 · Updated 5 years ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆413 · Updated 3 months ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆245 · Updated 2 years ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆387 · Updated last year
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆75 · Updated last year
- This repository contains EmoBank, a large-scale text corpus manually annotated with emotion according to the psychological Valence-Arousa… ☆204 · Updated 2 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆202 · Updated 3 years ago
- GSDMM: Short text clustering ☆355 · Updated 2 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆356 · Updated 2 years ago
- LexRank algorithm for text summarization ☆230 · Updated last year
- LASER multilingual sentence embeddings as a pip package ☆223 · Updated last year
- A multilingual lexicon of words to hurt. ☆89 · Updated 6 months ago
- Linguistic and stylistic complexity measures for (literary) texts ☆81 · Updated last year
- A Dataset of German Legal Documents for Named Entity Recognition ☆168 · Updated 2 years ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,228 · Updated 3 months ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆262 · Updated 6 months ago
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆849 · Updated 8 months ago