liberit / scraptils
scraper related helper functions
☆27 Updated 11 years ago
Alternatives and similar repositories for scraptils
Users that are interested in scraptils are comparing it to the libraries listed below
- Python library and command line tool for converting data from one format to another ☆99 Updated 5 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 Updated 12 years ago
- Demo of the Newspaper article extraction library. ☆29 Updated 10 years ago
- A library for extracting tables from PDF files ☆89 Updated 11 years ago
- ScraperWiki Python library for scraping and saving data ☆158 Updated 2 years ago
- An eBook tool to extract the ISBN or metadata from eBooks and rename them using an ISBN database and the metadata ☆30 Updated 10 years ago
- Automated NLP sentiment predictions- batteries included, or use your own data ☆18 Updated 7 years ago
- Convert cron emails to RSS 2.0. It's the least you can do. ☆14 Updated 10 years ago
- Junk drawer of old scripts. ☆18 Updated 9 years ago
- Short script for removing watermarks from PDF files. Requires pdftk. ☆59 Updated 6 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 9 years ago
- Python scripts for scraping bus ticket data from the websites of BoltBus, Greyhound, Megabus, GoBus, Amtrak, Peterpan, and EasternTravel. ☆38 Updated 4 years ago
- Sample Python connector to the Gnip streaming services ☆13 Updated 10 years ago
- An online sentiment analyzer built with Flask and TextBlob ☆15 Updated 12 years ago
- Grabbing all news. ☆62 Updated 5 years ago
- Hash-based password manager ☆19 Updated 6 years ago
- Open Source Social Media Monitoring And Engagement System Core/API ☆36 Updated 11 years ago
- A simple, system independent infrastructure for performing web scraping. Utilizes Vagrant virtualbox interface and puppet provisioning to… ☆24 Updated 11 years ago
- A curated list of awesome Jupyter and IPython links ☆29 Updated 7 years ago
- ☆36 Updated last year
- Feed discovery to share :) ☆41 Updated 8 years ago
- A Python script to download all your mail from Gmail to your local hard drive. ☆138 Updated 4 months ago
- I'm Leselys, your very elegant RSS reader. ☆226 Updated 4 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 10 years ago
- Intelligent RSS news aggregator. ☆33 Updated last year
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- Removes duplicate files from specified folders ☆47 Updated 10 years ago
- Scraper built with Scrapy. ☆18 Updated last year
- This is a heroku buildpack for Pelican. ☆23 Updated 3 years ago