18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to scrape web content into structured data quickly and easily, without modifying your core system.
☆24 · Updated 11 years ago
Alternatives and similar repositories for scrapebox
Users who are interested in scrapebox are comparing it to the repositories listed below.
- Twerp is the telephone hackers toolkit. It's also a command-line app for Twilio, written in Python ☆26 · Updated 4 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 · Updated 7 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 · Updated 12 years ago
- A tool to graph who has sent you the most emails ☆17 · Updated 8 years ago
- Bringing sanity to world of messed-up data ☆66 · Updated 10 years ago
- ☆36 · Updated last year
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 · Updated 7 years ago
- Python library with common functionality for writing web scrapers ☆102 · Updated 10 years ago
- framework for scraping legislative/government data ☆86 · Updated 10 months ago
- AES encrypted password manager ☆186 · Updated 10 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆36 · Updated 10 years ago
- Write you a home page with bookmarks well-organized. ☆16 · Updated 7 years ago
- "Hacker-CMS" Sandstorm App mashing up Jekyll, Ace Editor, and jsTree ☆67 · Updated 9 years ago
- A native web-based client for Slack. ☆23 · Updated 7 years ago
- ScraperWiki Python library for scraping and saving data ☆159 · Updated 2 years ago
- scraper related helper functions ☆27 · Updated 11 years ago
- The Python Achievements Framework! ☆118 · Updated 3 years ago
- A modern web based communication service on top IRC. ☆152 · Updated 8 years ago
- Junk drawer of old scripts. ☆18 · Updated 9 years ago
- A photobooth script that automatically snaps a photo, applies a watermark, uploads to a remote server, generates a QRCode, shortens the U… ☆69 · Updated 9 years ago
- A portable, lightweight, locally-hosted IPv4 and IPv6 geolocation API/server ☆40 · Updated 7 years ago
- Python interface to Digital Ocean ☆24 · Updated 10 years ago
- Python module to watch Twitter user pages or search-results. ☆63 · Updated 11 years ago
- My Privoxy configuration files ☆27 · Updated 13 years ago
- Tiny python web crawler ☆169 · Updated 9 years ago
- Command-line tool for interacting with Pancake. ☆14 · Updated 11 years ago
- Secure random passwords in javascript ☆18 · Updated 5 years ago
- An automatic issue creator for Github, written in Python. It'll go through your entire git repo, look for any lines that start with TODO:… ☆83 · Updated 7 years ago
- Python scripts for scraping bus ticket data from the websites of BoltBus, Greyhound, Megabus, GoBus, Amtrak, Peterpan, and EasternTravel. ☆38 · Updated 4 years ago
- This is a heroku buildpack for Pelican. ☆23 · Updated 3 years ago