18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to set up and run scrapers that turn web content into structured data quickly and easily, without modifying your core system.
☆24 Updated 11 years ago
Alternatives and similar repositories for scrapebox
Users who are interested in scrapebox are comparing it to the libraries listed below.
- Twerp is the telephone hacker's toolkit. It's also a command-line app for Twilio, written in Python. ☆27 Updated 5 years ago
- A pastebin for tables. ☆34 Updated 12 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆48 Updated 7 years ago
- ☆36 Updated 2 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault-tolerant parser. ☆30 Updated 13 years ago
- A tool to graph who has sent you the most emails ☆17 Updated 8 years ago
- The Python Achievements Framework! ☆118 Updated 4 years ago
- Bringing sanity to the world of messed-up data ☆66 Updated 11 years ago
- Python scripts for scraping bus ticket data from the websites of BoltBus, Greyhound, Megabus, GoBus, Amtrak, Peterpan, and EasternTravel. ☆39 Updated 5 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- Feedbuffer buffers RSS and Atom syndication feeds, that is to say it caches new feed entries until the news aggregator requests them and … ☆19 Updated 9 years ago
- A component-based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 Updated 13 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆37 Updated 11 years ago
- A simple server that connects calls between citizens and their congressperson using the Twilio API ☆66 Updated 4 years ago
- A Scrapy extension to store request and response information in a storage service ☆27 Updated 3 years ago
- Sample applications that cover common use cases in a variety of languages. ☆18 Updated 14 years ago
- Python script for searching through your digital books and cataloguing them in an easy-to-share list of files. ☆31 Updated 6 years ago
- A native web-based client for Slack. ☆23 Updated 8 years ago
- A collaborative list of open-source alternatives to typical government and enterprise software needs ☆47 Updated 9 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 Updated 2 weeks ago
- Legacy backend for Open States ☆87 Updated 6 years ago
- Export a graph of links between items crawled by Scrapy in DOT file format. ☆26 Updated 14 years ago
- Framework for scraping legislative/government data ☆89 Updated 2 months ago
- A starter app template for Pinax apps ☆38 Updated 5 years ago
- Sync a website or local spreadsheet with a Google Sheet ☆35 Updated 3 years ago
- A Python client for Chrome's DevTools protocol / a headless Chrome control library ☆15 Updated 7 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆25 Updated 9 years ago
- Exporters is an extensible export pipeline library that supports filtering, transformation, and several sources and destinations ☆40 Updated last year
- Best-practices setup for large web apps, APIs, and CLI applications with Flask ☆13 Updated 3 years ago
- Python interface to DigitalOcean ☆24 Updated 10 years ago