liberit / scraptils
Scraper-related helper functions
☆27Updated 11 years ago
Alternatives and similar repositories for scraptils
Users that are interested in scraptils are comparing it to the libraries listed below
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes.☆150Updated 13 years ago
- ScraperWiki Python library for scraping and saving data; in maintenance mode☆158Updated last week
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy.☆47Updated 7 years ago
- A collaborative list of open-source alternatives to typical government and enterprise software needs☆47Updated 9 years ago
- Sample Python connector to the Gnip streaming services☆13Updated 11 years ago
- Grabbing all news.☆62Updated 6 years ago
- ClickScript is a visual programming language, a data flow programming language running entirely in a web browser.☆63Updated 13 years ago
- Superfeedr powered pipes!☆131Updated 10 years ago
- A JavaScript tool to visualize the diffs in Wikipedia☆35Updated 3 years ago
- Automated NLP sentiment predictions- batteries included, or use your own data☆18Updated 8 years ago
- A bash-only script to deploy $HOME dot files across different hosts.☆87Updated 3 years ago
- Demo of the Newspaper article extraction library.☆29Updated 11 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes.☆11Updated 8 years ago
- A library for extracting tables from PDF files☆89Updated 12 years ago
- Junk drawer of old scripts.☆18Updated 9 years ago
- A simple, system independent infrastructure for performing web scraping. Utilizes Vagrant virtualbox interface and puppet provisioning to…☆24Updated 11 years ago
- An online sentiment analyzer built with Flask and TextBlob☆15Updated 12 years ago
- Aviation grade news article metadata extraction☆36Updated 2 years ago
- I'm Leselys, your very elegant RSS reader.☆226Updated 5 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED]☆25Updated 9 years ago
- A small Python script for easy access to Firefox bookmarks and browsing history☆23Updated 5 years ago
- IPython Notebook Cookbook for Deployment via Chef☆41Updated 8 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik☆24Updated 9 years ago
- Short script for removing watermarks from PDF files. Requires pdftk.☆59Updated 6 years ago
- This is a heroku buildpack for Pelican.☆23Updated 3 years ago
- Small set of utilities to simplify writing Scrapy spiders.☆49Updated 10 years ago
- Python natural language processing work☆29Updated 16 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op…☆25Updated 10 years ago
- Archive.org OPDS Bookserver - A standard for digital book distribution☆130Updated 7 years ago
- Reduction is a python script which automatically summarizes a text by extracting the sentences which are deemed to be most important.☆54Updated 10 years ago