18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant with VirtualBox and Puppet provisioning to scrape web content into structured data quickly and easily, without modifying your core system.
☆24 Updated 11 years ago
Alternatives and similar repositories for scrapebox
Users who are interested in scrapebox are comparing it to the repositories listed below.
- ☆36 Updated last year
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 Updated 12 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆36 Updated 11 years ago
- Write you a home page with bookmarks well-organized. ☆16 Updated 8 years ago
- A tool to graph who has sent you the most emails ☆17 Updated 8 years ago
- "Hacker-CMS" Sandstorm App mashing up Jekyll, Ace Editor, and jsTree ☆67 Updated 9 years ago
- A portable, lightweight, locally-hosted IPv4 and IPv6 geolocation API/server ☆40 Updated 7 years ago
- Various Python scripts to scrape sites that store data about you. ☆28 Updated 11 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 Updated 7 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆30 Updated 13 years ago
- A pastebin for tables. ☆34 Updated 11 years ago
- A native web-based client for Slack. ☆23 Updated 8 years ago
- craigslist image processing service ☆98 Updated 12 years ago
- AES encrypted password manager ☆186 Updated 10 years ago
- Python module to watch Twitter user pages or search-results. ☆63 Updated 11 years ago
- A Python SDK for Human + AI Conversational Experiences ☆10 Updated 8 years ago
- Feedbuffer buffers RSS and Atom syndication feeds, that is to say it caches new feed entries until the news aggregator requests them and … ☆19 Updated 9 years ago
- Secure random passwords in javascript ☆18 Updated 5 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 10 years ago
- A simpler way to manage servers online. Commando.io empowers users to be more efficient, improve their workflow, and eliminate anxiety ov… ☆412 Updated 10 years ago
- The Python Achievements Framework! ☆118 Updated 3 years ago
- a simple server that connects calls between citizens and their congress person using the Twilio API ☆66 Updated 3 years ago
- scraper related helper functions ☆27 Updated 11 years ago
- Fail2ban web dashboard written with Flask framework. (not maintained) ☆126 Updated 7 years ago
- We use Tock to track and report our time at 18F ☆125 Updated this week
- This is a heroku buildpack for Pelican. ☆23 Updated 3 years ago
- Superfeedr powered pipes! ☆131 Updated 10 years ago
- framework for scraping legislative/government data ☆88 Updated last year