18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to scrape web content into structured data quickly and easily, without modifying your core system.
☆24 Updated 10 years ago
Alternatives and similar repositories for scrapebox:
Users who are interested in scrapebox are comparing it to the repositories listed below.
- A pastebin for tables. ☆34 Updated 11 years ago
- A WayBack Machine Time-Lapse Generator ☆29 Updated 6 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆36 Updated 10 years ago
- Twerp is the telephone hackers toolkit. It's also a command-line app for Twilio, written in Python ☆26 Updated 4 years ago
- This is a Heroku buildpack for Pelican. ☆23 Updated 2 years ago
- Python command line tools, for increased fu. ☆46 Updated 9 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆46 Updated 7 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 Updated 7 years ago
- An autoscaling Python script for Heroku ☆27 Updated 12 years ago
- Scraper built with Scrapy. ☆17 Updated 8 months ago
- A weather monitoring Dashboard built upon Python and Yahoo API ☆14 Updated 9 years ago
- A Python version (almost a port) of ProPublica's TableFu ☆231 Updated 11 years ago
- ☆35 Updated last year
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 Updated 8 years ago
- Surlex (Simple URL expression translator) - Language for URL matching and extraction ☆71 Updated 11 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆31 Updated 12 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 10 years ago
- A more liberal autolink extension for Python Markdown ☆20 Updated 2 years ago
- Very simple Netflix API client ☆24 Updated 14 years ago
- Junk drawer of old scripts. ☆18 Updated 8 years ago
- An example REST API with Django, Tastypie, xAuth and Heroku ☆72 Updated 5 years ago
- Sample applications that cover common use cases in a variety of languages. ☆18 Updated 13 years ago
- ☆48 Updated 5 years ago
- This project is no longer maintained. Check out https://timesheet.gregbrown.co - the time tracking application which grew out of this cod… ☆20 Updated 5 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- Proxy-list management application for Django ☆23 Updated 7 years ago
- Removes Google Analytics-related utm_ parameters from displayed URLs ☆28 Updated 8 years ago
- In browser VNC client through websockets ☆30 Updated 13 years ago
- Convert cron emails to RSS 2.0. It's the least you can do. ☆14 Updated 9 years ago
- Small web parser that gets all the top jobs and visualizes the various salaries for each position ☆21 Updated 9 years ago