the-dataface / Newspaper-Scrapers
Scrape article metadata from major media outlets' websites, including NYT, WaPo, and WSJ. Built on top of the Newspaper Python library (https://github.com/codelucas/newspaper).
☆51 Updated 7 years ago
Alternatives and similar repositories for Newspaper-Scrapers:
Users who are interested in Newspaper-Scrapers are comparing it to the libraries listed below.
- Google News Scraper for languages like Japanese, Chinese... [VPN Support] ☆97 Updated 3 years ago
- ☆22 Updated 4 years ago
- Scraping Reddit without the Reddit API, building a dataset, and using it for a machine learning project. ☆81 Updated 2 years ago
- Scrapers from a project in 2018. Yelp, SpyFu, Similarweb, Morningstar, LinkedIn, Instagram, Inside, Glassdoor, Facebook, Eat24, DoorDash,… ☆95 Updated 5 years ago
- Data Science module - text analytics, Natural Language Processing, and Machine Learning on social media (Twitter) data ☆24 Updated 5 years ago
- Finco automates the process of generating financial documentation and valuations for companies traded on the NASDAQ and NYSE. Provides us… ☆26 Updated 9 years ago
- A Google Trends Analytics Package ☆13 Updated 10 months ago
- Simple metrics on sports betting ☆24 Updated last year
- Jupyter notebooks for Data Science for Journalism ☆15 Updated 5 years ago
- Web scraper for Indeed job search that reveals the skill keywords required of data scientists ☆36 Updated 8 years ago
- Modelling Big Five Personality Inventory using Machine Learning algorithms ☆22 Updated 5 months ago
- I'm a curious person and analysing world news is fun. Here I'm gathering all my GDELT-related projects. ☆20 Updated 4 years ago
- Downloads news articles from Google News and uses pre-trained NLP models to perform sentiment analysis ☆58 Updated 3 years ago
- The objective of this project is to scrape a corpus of news articles from a set of web pages, pre-process the corpus, and then to apply u… ☆51 Updated 7 years ago
- Twitter Trends is a web-based application that automatically detects and analyzes emerging topics in real time through hashtags and user … ☆105 Updated 7 years ago
- Python scripts to extract text from PDFs, save it as a text file, export a list of words and their frequencies to a CSV file for further … ☆35 Updated 7 years ago
- This repository includes our work on extracting the digital transformation strategy of Fortune 500 companies from earnings calls transcri… ☆28 Updated 4 years ago
- Using Machine Learning and Neural Nets to predict NCAA basketball point spreads. ☆14 Updated 5 years ago
- A multiprocessing web scraper for Coursera.org to build a dataset for all courses with details like ratings, difficulty, etc. ☆19 Updated 4 years ago
- Scrapes names and tickers from magicformulainvesting.com every quarter, adds info to a Google Sheet which includes stock prices and a lin… ☆32 Updated last year
- An open source book on Python tailored for communication students with zero background ☆118 Updated 5 years ago
- Newsfeed based on GDELT Project ☆24 Updated 11 months ago
- An open interface to GDELT APIs ☆47 Updated last year
- Fetches Daily Data from Google Trends ☆11 Updated 5 years ago
- Real estate automated valuation model ☆35 Updated 8 years ago
- Simple Python utility that downloads and extracts SEC financial statement data sets. ☆32 Updated 7 years ago
- Collect data from Facebook Posts based on search queries 🦋 ☆41 Updated 2 years ago
- Git Repo for Articles on Ergo Sum blog and the YouTube channel https://www.youtube.com/channel/UCiie9CN--dazA7iT2sry5FA ☆87 Updated 3 months ago
- Handy Jupyter Notebooks that I use for Topic Modeling. Including text mining from PDF files, text preprocessing, Latent Dirichlet Allo… ☆42 Updated 5 years ago
- Use Google Flights API and scrape Expedia to find the cheapest/shortest flights! ☆24 Updated 7 years ago