18F / scrapebox
A simple, system-independent infrastructure for web scraping. It uses Vagrant's VirtualBox interface and Puppet provisioning to set up and run scrapers that turn web content into structured data quickly and easily, without modifying your core system.
☆24 Updated 10 years ago
Alternatives and similar repositories for scrapebox
Users who are interested in scrapebox are comparing it to the libraries listed below.
- An example REST API with Django, Tastypie, xAuth and Heroku ☆72 Updated 5 years ago
- Python command line tools, for increased fu. ☆46 Updated 9 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 Updated 7 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆46 Updated 7 years ago
- Write you a home page with bookmarks well-organized. ☆16 Updated 7 years ago
- A more liberal autolink extension for python Markdown ☆20 Updated 2 years ago
- Very simple Netflix API client ☆24 Updated 14 years ago
- Google Cloud Datastore storage module for Botkit ☆14 Updated 11 months ago
- ☆48 Updated 5 years ago
- Twerp is the telephone hackers toolkit. It's also a command-line app for Twilio, written in Python ☆26 Updated 4 years ago
- Proxy-list management application for Django ☆23 Updated 7 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆36 Updated 10 years ago
- Simple to use python library for Buffer App ☆23 Updated 2 years ago
- Export a graph of links between items crawled by Scrapy in DOT file format. ☆26 Updated 13 years ago
- video indexing site ☆217 Updated 9 years ago
- giving git more tentacles ☆577 Updated 6 years ago
- Surlex (Simple URL expression translator) - Language for URL matching and extraction ☆71 Updated 11 years ago
- Django async media encoding ☆9 Updated 8 years ago
- Whit is an open source SMS service, which allows you to query CrunchBase, Wikipedia, and several other data APIs. ☆198 Updated 12 years ago
- This is a heroku buildpack for Pelican. ☆23 Updated 3 years ago
- Webhooks for Django *experimental* ☆62 Updated 15 years ago
- It is a plugin to pyexcel and provides the capability to present and write data in text formats using tabulate ☆11 Updated 7 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 8 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- Stuff I use on Linux. ☆30 Updated 10 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 Updated 10 years ago
- A python auto-scaffolding tool for MVC applications like Django. ☆18 Updated 9 years ago
- A small debugging helper for Flask. ☆10 Updated 9 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 10 years ago
- Feedbuffer buffers RSS and Atom syndication feeds, that is to say it caches new feed entries until the news aggregator requests them and … ☆19 Updated 8 years ago