18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to set up and run scrapers that turn web content into structured data quickly and easily, without modifying your core system.
☆24 Updated 11 years ago
Alternatives and similar repositories for scrapebox
Users who are interested in scrapebox are comparing it to the libraries listed below.
- Twerp is the telephone hackers toolkit. It's also a command-line app for Twilio, written in Python ☆27 Updated 5 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 Updated 13 years ago
- A pastebin for tables. ☆34 Updated 12 years ago
- ☆36 Updated 2 years ago
- "Hacker-CMS" Sandstorm App mashing up Jekyll, Ace Editor, and jsTree ☆68 Updated 9 years ago
- Write you a home page with bookmarks well-organized. ☆16 Updated 8 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 11 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- Keep an eye on specific keywords being posted on Twitter ☆46 Updated 10 years ago
- a simple server that connects calls between citizens and their congress person using the Twilio API ☆66 Updated 4 years ago
- A tool to graph who has sent you the most emails ☆17 Updated 8 years ago
- Python module to watch Twitter user pages or search-results. ☆64 Updated 11 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 Updated last week
- Open Source Social Media Monitoring And Engagement System Core/API ☆37 Updated 11 years ago
- video indexing site ☆214 Updated 10 years ago
- Python framework for creating and deploying Twitter bots. ☆59 Updated 5 years ago
- A modern web based communication service on top IRC. ☆151 Updated 8 years ago
- framework for scraping legislative/government data ☆89 Updated 2 months ago
- A bootstrap Python application, so that you can focus on writing code ☆286 Updated 12 years ago
- A set of Python classes that interact with and extend the Keybase.io data store of public keys ☆30 Updated 6 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 10 years ago
- python boilerplate application following MVC pattern using bottle micro framework ☆86 Updated 12 years ago
- Secure random passwords in javascript ☆18 Updated 6 years ago
- craigslist blob service ☆92 Updated 8 years ago
- Python script for searching through your digital books and cataloguing them in an easy-to-share list of files. ☆31 Updated 6 years ago
- Python library to extract text from PDF, and default to OCR when text extraction fails. ☆62 Updated 8 years ago
- Tiny python web crawler ☆169 Updated 9 years ago
- Mock HTTP server ☆69 Updated 7 years ago
- Whit is an open source SMS service, which allows you to query CrunchBase, Wikipedia, and several other data APIs. ☆198 Updated 12 years ago