pgcorpus / gutenberg-analysis
Analysis of the Gutenberg dataset
☆40 Updated 5 years ago
Related projects
Alternatives and complementary repositories for gutenberg-analysis
- Information and data related to the ProtestNews shared task at CASE @ ACL-IJCNLP 2021 workshop ☆43 Updated 2 years ago
- ☆22 Updated last year
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 Updated 4 years ago
- This repository contains papers and resources pertaining to Hate speech research. ☆43 Updated 3 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆65 Updated last year
- ParaNames: A multilingual resource for parallel names ☆30 Updated 6 months ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆56 Updated last year
- ☆54 Updated 2 years ago
- Code for the paper "Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora", ACL 2020. ☆18 Updated 4 years ago
- ☆64 Updated last year
- Training Temporal Word Embeddings with a Compass ☆64 Updated last year
- An implementation of GrASP (Shnarch et al., 2017) ☆21 Updated 2 years ago
- Harassment Lexicon and Corpus ☆27 Updated 6 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆28 Updated 6 years ago
- Dynamic ensemble decoding with transformer-based models ☆29 Updated last year
- Python tools for text ☆15 Updated 4 years ago
- ☆17 Updated last year
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆14 Updated last year
- Multilingual Open Text ☆25 Updated 3 weeks ago
- Word Sense Induction with BERT MLM ☆28 Updated last year
- Wikipedia based dataset to train relationship classifiers and fact extraction models ☆25 Updated 3 years ago
- ☆73 Updated 3 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆78 Updated 10 months ago
- Use BERT to Fill in the Blanks ☆82 Updated 2 years ago
- ☆40 Updated 4 years ago
- English Small World of Words SWOWEN-2018 ☆66 Updated 2 years ago
- This repository hosts the code for a tokenizer of tweets. ☆12 Updated 5 years ago
- XED multilingual emotion datasets ☆56 Updated last year
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 Updated 4 years ago