MachinePublishers / ScreenSlicer
Automatic, zero-config web scraping. Written in Java, with no dependency on Java EE or app servers; the web scraper exposes a RESTful/JSON API. Currently unmaintained.
☆156 · Updated 7 years ago
Related projects:
- A simple proxy web service in 19 lines of Python code. ☆23 · Updated 9 years ago
- How to spot first stories on Twitter using Storm. ☆124 · Updated 9 months ago
- Blog crawler for the blogforever project. ☆22 · Updated 10 years ago
- Algorithmic summarizer for RSS/Atom feeds, web URLs and arbitrary text. Codebase for the application deployed at http://tldrzr.herokuapp.… ☆53 · Updated 8 years ago
- 'People who downloaded this paper also downloaded...' ☆51 · Updated 11 years ago
- Mavenized version of Kelvin Tan's example (http://www.lucenetutorial.com/lucene-in-5-minutes.html) ☆69 · Updated last month
- A collection of efficient utilities for a data scientist. ☆40 · Updated 9 years ago
- WARC (Web Archive) input and output formats for Hadoop. ☆35 · Updated 9 years ago
- Akiva is a simple natural-language-processing, question-answering artificial intelligence. ☆351 · Updated 10 years ago
- Face detection. Modified version of http://code.google.com/p/jviolajones/ ☆63 · Updated 7 years ago
- Create Python web applications for Google Glass. ☆276 · Updated 10 years ago
- Chatbots is a library for the Processing programming language and environment that provides classes implementing a variety of chatter-bot… ☆57 · Updated 14 years ago
- Faceted search engine. ☆42 · Updated 9 years ago
- IPython Notebook cookbook for deployment via Chef. ☆41 · Updated 7 years ago
- A component-based data flow framework with a drag-and-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 · Updated 11 years ago
- Sikuli-Slides is a visual automation tool that enables users to automate and test Graphical User Interfaces (GUIs) using presentation sli… ☆65 · Updated 8 years ago
- Speedy and simplistic static site generator. ☆27 · Updated last year
- The first open source document analysis platform. ☆65 · Updated 3 years ago
- Collects multimedia content shared through social networks. ☆19 · Updated 9 years ago
- OUTDATED VERSION: collective consciousness fiction generator. ☆47 · Updated 9 years ago
- Alenka JDBC is a library for accessing and manipulating data with the open-source GPU database Alenka. ☆19 · Updated 10 years ago
- Slinky, a high-performance web crawler and text analytics toolkit in Python, Redis, Hadoop, R, Gephi. ☆41 · Updated 14 years ago
- Sometimes sites make crawling hard. Selenium-crawler uses Selenium automation to fix that. ☆125 · Updated 11 years ago
- Java-based implementation of an unofficial Google Trends API. ☆92 · Updated 9 years ago