hollingsworthd / ScreenSlicer
Automatic, zero-config web scraping -- written in Java, has no dependency on Java EE or app servers, and the web scraper has a RESTful/JSON API. Currently unmaintained.
☆155 Updated 7 years ago
Alternatives and similar repositories for ScreenSlicer:
Users who are interested in ScreenSlicer are comparing it to the libraries listed below:
- Blog crawler for the blogforever project. ☆22 Updated 11 years ago
- WARC (Web Archive) Input and Output Formats for Hadoop ☆35 Updated 10 years ago
- Fabric3 Platform Repository ☆25 Updated 7 years ago
- Face Detection. Modified version of http://code.google.com/p/jviolajones/ ☆63 Updated 8 years ago
- cron-like jobs for back-end systems ☆76 Updated 6 years ago
- A simple proxy web service in 19 lines of Python code. ☆23 Updated 10 years ago
- ☆48 Updated 7 years ago
- How to spot first stories on Twitter using Storm. ☆125 Updated last year
- Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database. ☆120 Updated 7 years ago
- Sikuli-Slides is a visual automation tool that enables users to automate and test Graphical User Interfaces (GUIs) using presentation sli… ☆65 Updated 8 years ago
- A collection of efficient utilities for a data scientist. ☆41 Updated 9 years ago
- ☆21 Updated 9 years ago
- A small Java library for simple text analysis - counting strings, identifying languages, and removing stop words. ☆156 Updated 5 years ago
- A library for extracting tables from PDF files ☆90 Updated 11 years ago
- An alternative take on Java object relational mapping ☆51 Updated 5 months ago
- A stream of deduplicated tweets built using RxJava and Twitter4J