18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to scrape web content into structured data quickly and easily, without modifying your core system.
☆24Updated 10 years ago
Alternatives and similar repositories for scrapebox:
Users who are interested in scrapebox are comparing it to the repositories listed below.
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy.☆46Updated 7 years ago
- Python command line tools, for increased fu.☆46Updated 9 years ago
- This is a heroku buildpack for Pelican.☆23Updated 3 years ago
- A tool to graph who has sent you the most emails☆18Updated 8 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser.☆31Updated 12 years ago
- Open Source Social Media Monitoring And Engagement System Core/API☆36Updated 10 years ago
- A native web-based client for Slack.☆23Updated 7 years ago
- ☆35Updated last year
- Secure random passwords in javascript☆18Updated 5 years ago
- Proxy-list management application for Django☆23Updated 7 years ago
- Python library with common functionality for writing web scrapers☆102Updated 9 years ago
- A Python version (almost a port) of ProPublica's TableFu☆231Updated 11 years ago
- A Grooveshark song downloader in Python☆120Updated 8 years ago
- foauth.org makes OAuth optional.☆184Updated 8 years ago
- Mailchimp signup utility based on flask + aws lambda/s3☆12Updated 9 years ago
- A UI for docker-machine☆13Updated 8 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes.☆11Updated 7 years ago
- Balanced API library in python.☆69Updated 3 years ago
- video indexing site☆217Updated 9 years ago
- bookmark management for the Django web framework☆17Updated 9 years ago
- Pypo is a self-hosted bookmarking service like Pocket, implemented in Python with Django☆29Updated 9 years ago
- Removes Google Analytics-related utm_ parameters from displayed URLs☆28Updated 9 years ago
- Bringing sanity to world of messed-up data☆66Updated 10 years ago
- scraper related helper functions☆27Updated 10 years ago
- Surlex (Simple URL expression translator) - Language for URL matching and extraction☆71Updated 11 years ago
- Junk drawer of old scripts.☆18Updated 9 years ago
- django buddy, a chat bot that uses Django as the server and Python AIML as the backend.☆20Updated 11 years ago
- The code for the librelist.com free email service.☆73Updated 10 years ago
- Twerp is the telephone hacker's toolkit. It's also a command-line app for Twilio, written in Python☆26Updated 4 years ago
- An example REST API with Django, Tastypie, xAuth and Heroku☆72Updated 5 years ago