momer / nutch-selenium
☆28Updated 8 years ago
Alternatives and similar repositories for nutch-selenium:
Users who are interested in nutch-selenium are comparing it to the libraries listed below:
- A Nutch 2.2.1 plugin which allows users to shuffle off the responsibility for retrieving pages to a selenium hub/node spoke system. This …☆16Updated 8 years ago
- A Query Autofiltering SearchComponent for Solr that can translate free-text queries into structured queries using index metadata☆28Updated 6 years ago
- Storm / Solr Integration☆19Updated last year
- Fabric-based framework for deploying and managing SolrCloud clusters in the cloud.☆90Updated 6 years ago
- ☆36Updated 9 years ago
- Beyond Piwik Analytics with Scala and Apache Spark☆46Updated 10 years ago
- SIREn - Semi-Structured Information Retrieval Engine☆107Updated 3 years ago
- Using latent Dirichlet allocation (LDA) in Apache Lucene☆58Updated 12 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.☆281Updated 6 years ago
- Solr Redis Extensions☆52Updated last year
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark☆51Updated 11 years ago
- A new object-graph-wrapper for the Tinkerpop 3 graph stack.☆40Updated 4 years ago
- Apache Spark applications☆70Updated 7 years ago
- Starter project for building MemSQL Streamliner Pipelines☆32Updated 7 years ago
- An example stand alone program to import CSV files into Apache Cassandra using Apache Spark☆19Updated 9 years ago
- Bixo is an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop. By building a customized Cascading p…☆142Updated 2 years ago
- Cascading on Apache Flink®☆54Updated last year
- Mirror of Apache Spark☆57Updated 9 years ago
- ☆20Updated 2 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine☆46Updated 5 years ago
- WARC (Web Archive) Input and Output Formats for Hadoop☆35Updated 10 years ago
- Elasticsearch plugin for b-bit minhash algorithm☆62Updated 9 months ago
- Calculating Value at Risk with Spark☆56Updated 10 years ago
- Fusion demo app searching open-source project data from the Apache Software Foundation☆42Updated 6 years ago
- Mirror of Apache Stanbol (incubating)☆112Updated last year
- An implementation of locality sensitive hashing with Hadoop☆57Updated 10 years ago
- RDF-Centric Map/Reduce Framework and Freebase data conversion tool☆148Updated 3 years ago
- Code to index HDFS to Solr using MapReduce☆52Updated 6 years ago
- Next-generation web analytics processing with Scala, Spark, and Parquet.☆331Updated 10 years ago
- TinkerPop3 (Moved To Apache TinkerPop)☆214Updated 8 years ago