larsmans / lucene-stanford-lemmatizer
A library that adds some NLP capabilities to the Lucene search engine
☆50Updated 11 years ago
Related projects:
- Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.☆158Updated last year
- A Hadoop toolkit for web-scale information retrieval research☆79Updated 9 years ago
- A toolkit that wraps various natural language processing implementations behind a common interface.☆101Updated 6 years ago
- Document clustering based on Latent Semantic Analysis☆96Updated 14 years ago
- simple simhashing in hadoop with cascading☆33Updated 13 years ago
- SIREn - Semi-Structured Information Retrieval Engine☆106Updated 3 years ago
- Large RDF hierarchies as vector spaces☆20Updated 10 years ago
- Search a single field with different query time analyzers in Solr☆25Updated 4 years ago
- Implementation of Tyler Neylon's Locality-Sensitive Hash based on simplex tessellations☆28Updated 12 years ago
- NLP tools developed by Emory University.☆60Updated 8 years ago
- A Query Autofiltering SearchComponent for Solr that can translate free-text queries into structured queries using index metadata☆28Updated 5 years ago
- KEA 5.0 (keyphrase extraction software), modified to be an XML-RPC service☆42Updated 13 years ago
- Elasticsearch Latent Semantic Indexing experimentation☆33Updated 4 years ago
- Using latent Dirichlet allocation (LDA) in Apache Lucene☆58Updated 11 years ago
- Movie recommendations and more in MapReduce and Scalding☆117Updated 11 years ago
- Solr Dictionary Annotator (Microservice for Spark)☆70Updated 4 years ago
- Machine learning and natural language processing with Apache Pig☆53Updated 10 years ago
- A Stanford CoreNLP server, with example clients, using Apache Thrift.☆47Updated 5 years ago
- xlvector's solution of github contest☆33Updated 15 years ago
- ElasticSearch Prediction Generator and Plugin☆22Updated 9 years ago
- ☆22Updated this week
- Bulk loading for elastic search☆186Updated 9 months ago
- Jeremy's Machine Learning Library☆52Updated 8 years ago
- A RankLib based Solr Learning to Rank Plugin☆29Updated 2 years ago
- NLP Utilities in Java☆43Updated last year
- Analysis plugin for ElasticSearch providing capability for processing inline annotations in documents.☆35Updated 10 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.☆281Updated 6 years ago
- ☆20Updated 6 years ago
- The S-Space repository, from the AIrhead-Research group☆203Updated 3 years ago
- Mirror of Apache Stanbol (incubating)☆112Updated 6 months ago