socius-org / RedditHarbor
Ethical, legal, and effortless extraction of Reddit data in your database
☆65 Updated 5 months ago
Alternatives and similar repositories for RedditHarbor:
Users who are interested in RedditHarbor are comparing it to the libraries listed below.
- Example scripts for the pushshift dump files ☆345 Updated 2 weeks ago
- Making Reddit data accessible to researchers, moderators and everyone else. Interact with the data through large dumps, an API or web in… ☆361 Updated 3 weeks ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se… ☆148 Updated last year
- Newsfeed based on GDELT Project ☆23 Updated 10 months ago
- HDBSCAN Tuning for BERTopic Models ☆45 Updated last year
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆217 Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆166 Updated 9 months ago
- QualiGPT: An easy-to-use tool for qualitative research ☆26 Updated 5 months ago
- Zero/few shot learning components for scikit-learn pipelines with LLMs and transformers. ☆15 Updated 4 months ago
- Tools for interactive visual exploration of semantic embeddings. ☆32 Updated 6 months ago
- Cleans Reddit Text Data ☆81 Updated 4 years ago
- A collection of network-related python utilities. ☆16 Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 Updated 11 months ago
- Pushshift Telegram Ingest ☆86 Updated 5 years ago
- Code for measuring novelty in science using publication text ☆25 Updated 3 weeks ago
- Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job. ☆73 Updated 9 months ago
- Package to extract connotation frames ☆83 Updated last year
- Tools for conducting and parsing web search ☆39 Updated last week
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ☆336 Updated 7 months ago
- Download subreddit comments ☆94 Updated 3 years ago
- A web scraper for TikTok using Playwright ☆77 Updated this week
- Fast, flexible extraction of moral information from textual input data. ☆107 Updated last year
- NLP tool to extract dimensions of social exchange from textual conversations ☆10 Updated last year
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- A set of jupyter notebooks demonstrating how to use the Media Cloud API. ☆37 Updated last year
- Concept Modeling: Topic Modeling on Images and Text ☆205 Updated 4 months ago
- The Python toolkit for converting Reddit threads into organized text data. Extract and process Reddit content with ease! ☆94 Updated 7 months ago
- A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning ☆118 Updated last month
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated last year
- A multi-modal Twitter dataset with 7.6M tweets and 25.6M retweets related to voter fraud claims. ☆53 Updated 3 years ago