hollingsworthd / ScreenSlicer
Automatic, zero-config web scraping -- written in Java, has no dependency on Java EE or app servers, and the web scraper has a RESTful/JSON API. Currently unmaintained.
☆155Updated 7 years ago
Alternatives and similar repositories for ScreenSlicer:
Users who are interested in ScreenSlicer are comparing it to the libraries listed below:
- WARC (Web Archive) Input and Output Formats for Hadoop☆35Updated 10 years ago
- ☆20Updated 8 years ago
- A Java 8 library to extract main image from a URL or website link☆23Updated 5 years ago
- OpenBlock is a web application and RESTful service that allows users to browse and search their local area for "hyper-local news"☆61Updated 3 years ago
- cron-like jobs for back-end systems☆76Updated 6 years ago
- Private talks made easy ... for robots☆41Updated this week
- A Java annotation library for web scraping.☆28Updated 2 weeks ago
- A collection of efficient utilities for a data scientist.☆41Updated 9 years ago
- A stream of deduplicated tweets built using RxJava and Twitter4J☆10Updated 9 years ago
- How to spot first stories on Twitter using Storm.☆125Updated last year
- Repackaging of Boilerpipe published on Maven Central Repository.☆53Updated last year
- A simple proxy web service in 19 lines of Python code.☆23Updated 10 years ago
- Bixo is an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop. By building a customized Cascading p…☆142Updated 2 years ago
- Examples for my book "Power Java"☆21Updated 2 years ago
- JasDB OpenSource NoSQL Java based Database☆35Updated 9 months ago
- Chatbots is a library for the Processing programming language and environment that provides classes implementing a variety of chatter-bot…☆56Updated 14 years ago
- Blog crawler for the blogforever project.☆22Updated 11 years ago
- speedy and simplistic static site generator.☆28Updated 2 years ago
- Deprecated; use project scrape-itebooks☆32Updated 9 years ago
- Provides a standalone version of the JShell REPL. Anything needed to run JShell independently is contained, so there is no need to instal…☆40Updated last year
- Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi☆41Updated 14 years ago
- ☆29Updated last week
- Convenient web RSS reader☆51Updated last year
- natural language processing with link-grammar☆18Updated 15 years ago
- a lightweight feature-toggle library for Java☆18Updated 12 years ago
- A fast and easy-to-use decision tree learner in Java☆232Updated 2 years ago
- XTractor is an algorithmic text extractor from web pages written in Java. It builds upon the "commonly used web design practices" approac…☆43Updated 9 years ago
- ☆25Updated 9 years ago
- A flexible pure-Java OCR implementation. Eventually.☆20Updated 10 years ago
- A model-view based code generator written in Java☆41Updated 8 years ago