dominiksinsaarland / DocSCAN
Learning from Neighbors: Unsupervised Text Classification
☆17Updated 2 years ago
Alternatives and similar repositories for DocSCAN
Users who are interested in DocSCAN are comparing it to the libraries listed below
- Project repository of the paper "Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning wi…☆32Updated last year
- Repository for the paper Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions☆17Updated last year
- Introducing gpt_annotate: an easy-to-use python package designed to streamline automated text annotation using LLMs for different tasks a…☆28Updated 11 months ago
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence (NAACL 2022)☆34Updated 3 months ago
- Code and data for paper "Large language models can rate news outlet credibility"☆13Updated 11 months ago
- A python package to enrich Twitter Data☆75Updated 2 years ago
- ☆33Updated 2 years ago
- Twitter dataset for 2022 Russian and Ukrainian crisis☆48Updated 2 years ago
- ☆42Updated 5 years ago
- ☆50Updated 2 months ago
- Code for the paper "Content Analysis of Textbooks via Natural Language Processing".☆59Updated 2 years ago
- ☆35Updated 7 months ago
- This is a step-by-step tutorial for text analysts who want an easy start to basic and common techniques in NLP, Text Analysis, Machine…☆20Updated 2 years ago
- HDBSCAN Tuning for BERTopic Models☆48Updated 2 years ago
- Text-Based Ideal Points☆45Updated 2 years ago
- Code for the CUP Elements on text analysis in Python for social scientists☆137Updated 2 years ago
- Package to extract connotation frames☆86Updated last year
- TimeLMs: Diachronic Language Models from Twitter☆109Updated last year
- CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.☆151Updated 6 months ago
- Dutch abusive language data☆11Updated last year
- Noise-robust de-duplication at scale☆20Updated 2 years ago
- Concept Modeling: Topic Modeling on Images and Text☆213Updated 9 months ago
- The Harvard USPTO Patent Dataset☆69Updated last year
- Course repository for the session "Hands-on Transformers: Fine-Tune your own BERT and GPT" of the Data Science Summer School 2023☆88Updated last year
- A novel dataset containing over 15 Million COVID-19 vaccine-related tweets and 15 Thousand labeled tweets for vaccine misinformation detec…☆32Updated 11 months ago
- This is the data associated with the PERSUADE Corpus 2.0 version☆43Updated 8 months ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆47Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy.☆118Updated last year
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban…☆105Updated last year
- ☆15Updated 7 years ago