utcompling / OpenNLP-Models
A project for code to create models from existing corpora and distribute models.
☆42 Updated 12 years ago
Related projects:
- Document clustering based on Latent Semantic Analysis ☆96 Updated 14 years ago
- Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps. ☆158 Updated last year
- ☆22 Updated this week
- A Hadoop toolkit for web-scale information retrieval research ☆79 Updated 9 years ago
- simple simhashing in hadoop with cascading ☆33 Updated 13 years ago
- Mahout vector encoding for pig ☆54 Updated last year
- An example project for doing grid search in MLlib ☆13 Updated 9 years ago
- iSAX Indexing persisted in HBase ☆39 Updated 13 years ago
- Scala utilities for teaching computational linguistics and prototyping algorithms. ☆42 Updated 11 years ago
- Machine learning and natural language processing with Apache Pig ☆53 Updated 10 years ago
- NLP Utilities in Java ☆43 Updated last year
- Crux is a reporting application for HBase. Crux provides a simple web based graphical interface to access HBase, query data and create re… ☆100 Updated 11 years ago
- Python wrapper for the Vowpal Wabbit machine learning library. ☆53 Updated 11 years ago
- Website for standardized execution and evaluation of algorithms on datasets. ☆36 Updated 4 years ago
- distributed latent dirichlet allocation ☆30 Updated 12 years ago
- Example code to explore for using DL4J in Scala. ☆19 Updated 8 years ago
- My personal clojure library geared towards NLP applications ☆40 Updated 13 years ago
- Examples of use of pig scripting languages capabilities ☆39 Updated 8 years ago
- xlvector's solution of github contest ☆33 Updated 15 years ago
- RDF-Centric Map/Reduce Framework and Freebase data conversion tool ☆148 Updated 2 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆70 Updated 4 years ago
- (deprecated) Please use new nlp4l instead. ☆66 Updated 7 years ago
- A Query Autofiltering SearchComponent for Solr that can translate free-text queries into structured queries using index metadata ☆28 Updated 5 years ago
- ☆11 Updated this week
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop. ☆281 Updated 6 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- Streaming Histograms for Clojure/Java ☆154 Updated 3 months ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows. ☆54 Updated 9 years ago
- Open source framework for predictive modeling on Apache Hadoop ☆34 Updated 10 years ago
- ***Warning*** Old Apache Flink Graph API: This repository is not in use anymore. ☆16 Updated 8 years ago