dominiksinsaarland / DocSCAN
Learning from Neighbors: Unsupervised Text Classification
☆17 Updated 3 years ago
Alternatives and similar repositories for DocSCAN
Users who are interested in DocSCAN are comparing it to the libraries listed below
- Project repository of the paper "Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning wi… ☆34 Updated last year
- Repository for the paper Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions ☆17 Updated last year
- Package to extract connotation frames ☆91 Updated last year
- Code for the paper "Content Analysis of Textbooks via Natural Language Processing". ☆62 Updated 2 years ago
- ☆41 Updated 5 years ago
- HDBSCAN Tuning for BERTopic Models ☆49 Updated 2 years ago
- Text-Based Ideal Points ☆46 Updated 2 years ago
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence (NAACL 2022) ☆41 Updated last week
- Code for the CUP Elements on text analysis in Python for social scientists ☆138 Updated 3 years ago
- Twitter dataset for 2022 Russian and Ukrainian crisis ☆48 Updated 3 years ago
- GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/ ☆13 Updated last year
- Code and data for paper "Large language models can rate news outlet credibility" ☆13 Updated last year
- A Python package to enrich Twitter Data ☆75 Updated 2 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆110 Updated 2 years ago
- ☆15 Updated 7 years ago
- TimeLMs: Diachronic Language Models from Twitter ☆111 Updated last year
- Introducing gpt_annotate: an easy-to-use Python package designed to streamline automated text annotation using LLMs for different tasks a… ☆29 Updated last year
- Dutch abusive language data ☆11 Updated 2 years ago
- A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis) ☆28 Updated 4 months ago
- ☆172 Updated 2 years ago
- This is the data associated with the PERSUADE Corpus 2.0 version ☆46 Updated last year
- Repository for the CommonLit Ease of Readability Corpus ☆24 Updated last year
- ☆36 Updated 10 months ago
- ☆54 Updated 5 months ago
- Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpo… ☆53 Updated 7 months ago
- ☆40 Updated 4 years ago
- Additional material for the paper "MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction" ☆55 Updated 4 months ago
- Concept Modeling: Topic Modeling on Images and Text ☆215 Updated last year
- Tools to train and explore diachronic word embeddings from Big Historical Data ☆28 Updated 9 months ago
- A novel dataset containing over 15 Million COVID-19 vaccine-related tweets and 15 Thousand labeled tweets for vaccine misinformation detec… ☆33 Updated last year