umbrae / reddit-top-2.5-million
This dataset contains the all-time top 1,000 posts from each of the top 2,500 subreddits by subscriber count, pulled from reddit between August 15 and 20, 2013.
☆617 Updated 4 years ago
Alternatives and similar repositories for reddit-top-2.5-million:
Users who are interested in reddit-top-2.5-million are comparing it to the repositories listed below.
- Tools to work with the big reddit JSON data dump. ☆251 Updated 8 months ago
- 1mb Archive of Donald Trump Speeches ☆180 Updated 8 years ago
- A Python script that parses post titles, self-texts, and comments on reddit and makes word clouds out of the word frequencies. ☆286 Updated last year
- A collection of reddit bots and utilities ☆488 Updated 8 months ago
- The reddit Data Extractor is a cross-platform GUI tool for downloading almost any content posted to reddit. Downloads from specific users… ☆234 Updated 2 months ago
- Automatic Web Article Summarizer ☆415 Updated 3 years ago
- 2015 CrunchBase Data Export as CSV ☆160 Updated 9 years ago
- Fact Extraction from Wikipedia Text ☆531 Updated 8 years ago
- An automated subreddit with posts created using markov chains ☆468 Updated 9 years ago
- Extract user info from their reddit comments and activity. ☆65 Updated last year
- Data from the last ten years of reddit ☆45 Updated 9 years ago
- A python script for summarizing articles using nltk ☆544 Updated 8 years ago
- OKCupid profile datasets, code to scrape okcupid, and code to compute reading level of text ☆67 Updated 8 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆112 Updated 9 years ago
- Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database. ☆120 Updated 7 years ago
- Summarizes news articles ☆1,167 Updated 3 years ago
- A Python project inspired by the research of Chloé Kiddon and Yuriy Brun. Part of the Funniest Computer Ever Open Source initiative ☆57 Updated 6 years ago
- Here's what you sound like... ☆132 Updated 2 years ago
- Python library to convert textual numbers to integers ☆60 Updated 8 years ago
- will try to make interesting reddit crawlers that give some insight ☆380 Updated 8 years ago
- SnoopSnoo — reddit user and subreddits analytics ☆87 Updated 7 years ago
- A Python module to extract personality insights, sentiment & keywords from reddit accounts. pip install reddit_persona ☆26 Updated 7 years ago
- A large corpus of discourse annotations and relations on ~10K forum threads. ☆238 Updated 6 years ago
- Extract data from websites using basic statistical magic