elsevierlabs-os / spark-xml-utils
☆61 Updated 10 months ago
Alternatives and similar repositories for spark-xml-utils
Users who are interested in spark-xml-utils compare it to the libraries listed below.
- Simple Spark app that reads and writes Avro data ☆31 Updated 10 years ago
- Banana RDF ☆297 Updated 2 years ago
- CM-Well - a data warehouse for your knowledge graph ☆180 Updated 2 years ago
- Examples for different graph dbs ☆66 Updated 5 years ago
- Comprises the whole SANSA stack ☆15 Updated 4 years ago
- spark-sparql-connector ☆17 Updated 9 years ago
- A Scala DSL for programming with the OWL API. ☆57 Updated last year
- Chalk is a natural language processing library. ☆260 Updated 8 years ago
- Spark RDD with Lucene's query and entity linkage capabilities ☆128 Updated last month
- Bucketing and partitioning system for Parquet ☆30 Updated 7 years ago
- Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark ☆142 Updated last week
- ☆23 Updated 5 years ago
- RDF store on a cloud-based architecture (previously on https://code.google.com/p/cumulusrdf) ☆31 Updated 9 years ago
- ☆71 Updated 7 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 8 years ago
- SKOS Support for Apache Lucene and Solr ☆56 Updated 4 years ago
- Scala client for the Lightning data visualization server (WIP) ☆47 Updated 6 years ago
- SolrCloud HAFT is a High Availability and Fault Tolerant Framework for SolrCloud ☆30 Updated 8 years ago
- A framework for scalable graph computing. ☆147 Updated 7 years ago
- Secondary sort and streaming reduce for Apache Spark ☆78 Updated 2 years ago
- An RDF plugin for Solr ☆115 Updated 6 months ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 8 years ago
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support… ☆109 Updated 7 years ago
- A framework for creating composable and pluggable data processing pipelines using Apache Spark, and running them on a cluster. ☆47 Updated 9 years ago
- Experiments with the GDELT dataset and Cassandra schemas. ☆25 Updated 9 years ago
- ☆92 Updated 8 years ago
- Storm / Solr Integration ☆19 Updated last year
- Mirror of Apache Stanbol (incubating) ☆112 Updated last year
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆93 Updated 9 years ago