andreybratus / RefineOnSpark
☆33 · Updated 10 years ago
Alternatives and similar repositories for RefineOnSpark
Users who are interested in RefineOnSpark are comparing it to the libraries listed below.
- Apache NiFi NLP Processor ☆18 · Updated last year
- BatchRefine adds batch processing capabilities to OpenRefine ☆50 · Updated 8 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 · Updated 5 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 · Updated 8 years ago
- Cascading on Apache Flink® ☆54 · Updated last year
- ☆111 · Updated 8 years ago
- An open-source, vendor-neutral data context service. ☆159 · Updated 7 years ago
- spark-sparql-connector ☆17 · Updated 9 years ago
- CDAP Applications ☆43 · Updated 7 years ago
- Blazegraph Tinkerpop3 Implementation ☆61 · Updated 4 years ago
- Storm / Solr Integration ☆19 · Updated last year
- Simple Spark example of generating table stats for use of data quality checks ☆28 · Updated 8 years ago
- Docker image for apache zeppelin ☆38 · Updated 8 years ago
- Additional useful algorithms that can be used with spark. ☆24 · Updated 10 years ago
- ☆61 · Updated 8 months ago
- Power BI API adapter for Apache Spark (deprecated) ☆26 · Updated 7 years ago
- ☆41 · Updated 7 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆93 · Updated 9 years ago
- Mirror of Apache Stanbol (incubating) ☆112 · Updated last year
- ☆24 · Updated 9 years ago
- InsightEdge Core ☆20 · Updated last year
- An example project for doing grid search in MLlib ☆13 · Updated 10 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆29 · Updated 9 years ago
- Comprises the whole SANSA stack ☆15 · Updated 4 years ago
- A collection of tools for accessing Neo4j graph databases from Apache NiFi. ☆23 · Updated 6 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 · Updated 8 years ago
- Templates for projects based on top of H2O. ☆38 · Updated 2 months ago
- Reproducing Distributed Systems and Experiments on Cloud ☆39 · Updated last year
- ODPi specifications, developed by ODPi Runtime and ODPi Operations projects. Currently in Emeritus status ☆35 · Updated 6 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 · Updated 8 months ago