umbrae / reddit-top-2.5-million
This is a dataset of the all-time top 1,000 posts, from the top 2,500 subreddits by subscribers, pulled from reddit between August 15–20, 2013.
☆621 · Updated 5 years ago
Alternatives and similar repositories for reddit-top-2.5-million
Users who are interested in reddit-top-2.5-million compare it to the repositories listed below.
- A collection of reddit bots and utilities ☆492 · Updated 11 months ago
- A Python script that parses post titles, self-texts, and comments on reddit and makes word clouds out of the word frequencies. ☆289 · Updated 2 years ago
- Simple ML experiment to classify article titles as clickbait or news. ☆117 · Updated 2 years ago
- A python library for simple text summarization ☆219 · Updated 10 years ago
- SnoopSnoo — reddit user and subreddits analytics ☆89 · Updated 8 years ago
- An automated subreddit with posts created using markov chains ☆468 · Updated 9 years ago
- A python script for summarizing articles using nltk ☆545 · Updated 8 years ago
- Python library to convert textual numbers to integers ☆60 · Updated 8 years ago
- Scrapes public information off of LinkedIn ☆110 · Updated 9 years ago
- Data from the last ten years of reddit ☆45 · Updated 10 years ago
- will try to make interesting reddit crawlers that give some insight ☆380 · Updated 8 years ago
- The Zipru scraper developed in the Advanced Web Scraping Tutorial. ☆429 · Updated 8 years ago
- Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database. ☆123 · Updated 7 years ago
- Here's what you sound like... ☆133 · Updated 2 years ago
- Detects clickbait headlines using deep learning. ☆465 · Updated 5 years ago
- Rewriting web proxy and archival tool. At this point, it just tries to download all the things. ☆203 · Updated 2 weeks ago
- Lots and lots of web scrapers ☆181 · Updated 3 years ago
- Import tables from any Wikipedia article as a dataset in Python ☆291 · Updated 3 years ago
- Political Speech Generator ☆348 · Updated 9 years ago
- All stories and comments posted on Hacker News up to May 29, 2014 ☆129 · Updated 6 years ago
- Scrapy examples crawling Craigslist ☆199 · Updated 9 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆111 · Updated 9 years ago
- Hacker News plus topic tags. TechCrunch Disrupt NY Hackathon 2017 ☆123 · Updated 6 years ago
- A Python module to extract personality insights, sentiment & keywords from reddit accounts. pip install reddit_persona ☆26 · Updated 7 years ago
- Extract user info from their reddit comments and activity. ☆68 · Updated last year
- Provides content not accessible through the standard Amazon API ☆234 · Updated 7 years ago
- Automatic Web Article Summarizer ☆417 · Updated 3 years ago
- OKCupid profile datasets, code to scrape okcupid, and code to compute reading level of text ☆67 · Updated 9 years ago
- Automatic text summarization ☆244 · Updated 6 years ago
- Automatic keyword extraction - no alchemy required! ☆169 · Updated 9 years ago