fedecalendino / reddit-graph-releases
Releases for the reddit-graph project
☆18Updated 7 months ago
Alternatives and similar repositories for reddit-graph-releases:
Users who are interested in reddit-graph-releases are comparing it to the libraries listed below
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 5 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023☆19Updated last year
- Tag news stories based on models trained on the NYT corpus.☆42Updated last year
- Download subreddit comments☆93Updated 2 years ago
- Read compressed NDJSON .zst files easily☆32Updated 2 years ago
- A financial disclosure data extraction tool.☆13Updated last year
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP tasks with legal texts.☆37Updated 5 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆26Updated 2 years ago
- An experimental implementation of Burrows' Delta in Python 3☆20Updated 3 years ago
- Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence☆62Updated last year
- Python utility to archive and keep up-to-date archives of reddit subreddits. Archives to SQLite databases.☆29Updated last year
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values …☆15Updated 6 years ago
- Package for performing Reddit-based text analysis☆20Updated 6 years ago
- Crawl sites for RSS, Atom, and JSON feeds.☆69Updated 8 months ago
- Cleans Reddit Text Data☆81Updated 4 years ago
- Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames☆25Updated 4 years ago
- A GitHub Actions-based web scraper that publishes a CSV file containing domestic box office statistics each night. Downloads are availa…☆16Updated last month
- ☆125Updated last year
- An open interface to GDELT APIs☆45Updated last year
- Ethical, legal, and effortless extraction of Reddit data in your database☆64Updated 4 months ago
- A multithreaded Pushshift.io API wrapper for reddit.com comment and submission searches.☆215Updated last year
- Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Tak…☆68Updated 3 years ago
- Simple job postings scraper for Indeed based on requests and BeautifulSoup☆14Updated 3 years ago
- Convert text-intensive ICEWS data on Dataverse to conventional ISO-3166 and CAMEO codes☆14Updated 4 years ago
- Stylometry library for Burrows' Delta method☆34Updated 9 months ago
- Cleaning tool for web scraped text☆39Updated last year
- Meme processing pipeline that enables tracking memes across multiple Web communities.☆57Updated 4 years ago
- Random programs for reddit☆18Updated 5 years ago
- Regex-like pattern tree matching, but on a sentence's tree instead of strings☆42Updated 6 years ago
- Scalable String Similarity Joins in Python☆38Updated 7 months ago