eleflow / nutch-aws
☆25Updated 9 years ago
Alternatives and similar repositories for nutch-aws:
Users that are interested in nutch-aws are comparing it to the libraries listed below
- Apache Nutch fork tuned for web services and data discovery.☆9Updated 9 years ago
- A platform for real-time streaming search☆103Updated 8 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition☆136Updated 8 years ago
- Elasticsearch entity resolution plugin based on Duke☆210Updated 4 years ago
- Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles.☆38Updated 6 years ago
- Coding exercises for Apache Spark☆104Updated 9 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.☆281Updated 6 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc.☆57Updated 3 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine☆46Updated 5 years ago
- Text classification using Naive Bayes and Elasticsearch☆154Updated 8 years ago
- Simple search results with Solr and EmberJS☆58Updated 5 years ago
- ☆28Updated 8 years ago
- ☆35Updated 2 years ago
- Analytic UIMA pipelines using Spark☆23Updated 9 years ago
- Elasticsearch Index Termlist☆117Updated 5 years ago
- A sample application that consumes from Twitter using HBC and produces into Amazon Kinesis☆12Updated 9 years ago
- Solr Dictionary Annotator (Microservice for Spark)☆71Updated 4 years ago
- Educational Example of a custom Lucene Query & Scorer☆48Updated 4 years ago
- Starter project for building MemSQL Streamliner Pipelines☆32Updated 7 years ago
- A highly configurable Google Cloud Dataflow pipeline that writes data from Pub/Sub into a Google BigQuery table☆67Updated 6 years ago
- A simple example application that will connect to the Twitter API, run a search, gather tweets, and then calculate the sentiment of each …☆65Updated 8 years ago
- Demonstration of using Python to process the Common Crawl dataset with the mrjob framework☆166Updated 2 years ago
- RDF-Centric Map/Reduce Framework and Freebase data conversion tool☆148Updated 3 years ago
- Import Salesforce data into Hadoop HDFS in Avro format☆23Updated 5 years ago
- Dice Solr Plugins from Simon Hughes Dice.com☆87Updated 3 years ago
- DBpedia.org RDF to CSV for import into Neo4j☆51Updated 9 years ago
- ☆146Updated 8 years ago