adobe / ml-featurizer
ML Featurizer is a library to enable users to create additional features from raw data with ease
☆14Updated last year
Alternatives and similar repositories for ml-featurizer
Users who are interested in ml-featurizer are comparing it to the libraries listed below
- Python bindings for Apache Tika☆24Updated 5 years ago
- stop word lists in several languages☆21Updated 8 years ago
- Hadoop integration code for working with Apache cTAKES☆10Updated 12 years ago
- Way to run Uima Pipelines on Apache Spark☆10Updated 4 years ago
- Apache UIMA Ruta☆18Updated 3 months ago
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.☆38Updated last year
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.☆34Updated 2 years ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser☆13Updated 3 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.☆108Updated 10 months ago
- Analytic UIMA pipelines using Spark☆24Updated 10 years ago
- A toolkit for clustering web pages based on various similarity measures.☆34Updated 4 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image…☆95Updated 7 years ago
- A small java library for NLP Interchange Format (NIF) for NER(D) systems☆10Updated 3 years ago
- ☆22Updated 9 years ago
- Apache UIMA uimaFIT☆32Updated last year
- MITIE: library and tools for information extraction☆29Updated 11 years ago
- Examples of spark-lucenerdd☆15Updated 2 years ago
- SemanticVectors creates semantic WordSpace models from free natural language text.☆221Updated 3 years ago
- Babel Street Analytics Client Library for Python☆38Updated last month
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.☆284Updated 7 years ago
- ☆21Updated 11 years ago
- For extracting measurements and related entities from text☆58Updated 5 years ago
- General Architecture for Text Engineering☆49Updated 9 years ago
- ☆54Updated 7 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…☆17Updated 10 years ago
- Train a Word2Vec model or LSA model, and Implement Conceptual Search\Semantic Search in Solr\Lucene - Simon Hughes Dice.com, Dice Tech Jo…☆259Updated 6 years ago
- SecureGraph, similar to Blueprints but secure☆37Updated 8 years ago
- Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.☆201Updated last month
- Entity Extraction Text Processor☆149Updated 2 years ago
- Solr Dictionary Annotator (Microservice for Spark)☆71Updated 6 years ago