liberit / scraptils
Scraper-related helper functions
☆27Updated 10 years ago
Alternatives and similar repositories for scraptils
Users that are interested in scraptils are comparing it to the libraries listed below
- Junk drawer of old scripts.☆18Updated 9 years ago
- Smart progressbar with multiple backends supporting both explicit updating and tqdm-style iterable-wrapping☆10Updated 8 years ago
- A library for extracting tables from PDF files☆89Updated 11 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op…☆25Updated 9 years ago
- Automated NLP sentiment predictions- batteries included, or use your own data☆18Updated 7 years ago
- An eBook tool that extracts the ISBN or metadata from eBooks and renames them using an ISBN database and the metadata☆30Updated 9 years ago
- Python library and command line tool for converting data from one format to another☆99Updated 5 years ago
- A small python script for easy access to firefox bookmarks and browsing history☆22Updated 5 years ago
- A Python client for the GoodReads API☆36Updated 11 years ago
- This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by …☆28Updated 12 years ago
- ScraperWiki Python library for scraping and saving data☆159Updated 2 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy.☆46Updated 7 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik☆24Updated 9 years ago
- Convert cron emails to RSS 2.0. It's the least you can do.☆14Updated 10 years ago
- Automatic, zero-config web scraping -- written in Java, has no dependency on Java EE or app servers, and the web scraper has a restful/JS…☆155Updated 7 years ago
- Sample Python connector to the Gnip streaming services☆13Updated 10 years ago
- The web design and pages for my personal website.☆11Updated 4 years ago
- This is a heroku buildpack for Pelican.☆23Updated 3 years ago
- The Pyramid version of the todo app for the Python Web Shootout☆72Updated 2 years ago
- Pure python script that takes user query and summarizes news related to it.☆25Updated 2 years ago
- Open Source Social Media Monitoring And Engagement System Core/API☆36Updated 10 years ago
- ClickScript is a visual programming language, a data flow programming language running entirely in a web browser.☆63Updated 12 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes.☆150Updated 12 years ago
- This is a bot to download all your instagram gallery pictures in a single folder☆58Updated 9 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes.☆11Updated 7 years ago
- Vidscraper is a python library which provides a simple API for fetching video data from various web services and sites.☆62Updated 2 years ago
- Python library with common functionality for writing web scrapers☆102Updated 9 years ago
- Short script for removing watermarks from PDF files. Requires pdftk.☆59Updated 6 years ago
- Search engine for subtitles☆10Updated 10 years ago
- Demo of the Newspaper article extraction library.☆29Updated 10 years ago