dominiksinsaarland / DocSCAN
Learning from Neighbors: Unsupervised Text Classification
☆17 Updated 2 years ago
Related projects
Alternatives and complementary repositories for DocSCAN
- Repository for the CommonLit Ease of Readability Corpus ☆21 Updated 7 months ago
- Package to extract connotation frames ☆80 Updated 11 months ago
- Repository for the paper Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions ☆16 Updated 5 months ago
- Project repository of the paper "Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning wi… ☆27 Updated 8 months ago
- ☆40 Updated 4 years ago
- Dataset and code for directed sentiment analysis in news text. ☆16 Updated 3 years ago
- A python package to enrich Twitter Data ☆74 Updated last year
- ☆29 Updated last year
- Contextualised Word Representations for Lexical Semantic Change Analysis ☆31 Updated 4 years ago
- Code for the paper "Content Analysis of Textbooks via Natural Language Processing". ☆56 Updated last year
- ☆22 Updated 3 years ago
- Introducing gpt_annotate: an easy-to-use python package designed to streamline automated text annotation using LLMs for different tasks a… ☆28 Updated 2 months ago
- Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022). ☆5 Updated last year
- Code for the CUP Elements on text analysis in Python for social scientists ☆135 Updated 2 years ago
- A corpus of comments tagged for multiple attributes of unhealthiness. ☆34 Updated 3 years ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban… ☆102 Updated 9 months ago
- Training Temporal Word Embeddings with a Compass ☆64 Updated last year
- Driver for LIWC2015 analysis. LIWC2015 dictionary not included. ☆16 Updated last year
- Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpo… ☆40 Updated last week
- Noise-robust de-duplication at scale ☆15 Updated last year
- Code for the paper "Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings" ☆66 Updated 2 years ago
- Python Multilingual Ucrel Semantic Analysis System ☆30 Updated 3 months ago
- A module to compute textual lexical richness (aka lexical diversity). ☆92 Updated last year
- Tools for training and evaluating word embeddings based on subtitles. Published as "subs2vec: Word embeddings from subtitles in 55 langua… ☆33 Updated 4 years ago
- ☆22 Updated last year
- Code and data for paper "Large language models can rate news outlet credibility" ☆12 Updated 3 months ago
- ☆31 Updated last year
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆76 Updated 7 months ago
- Additional material for the paper "MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction" ☆53 Updated last year
- This is a simple Python package for calculating a variety of lexical diversity indices ☆65 Updated last year