trivago / hive-lambda-sting

A small library of Hive UDFs using macros to process and manipulate complex types

☆15

Alternatives and similar repositories for hive-lambda-sting

Users who are interested in hive-lambda-sting are comparing it to the libraries listed below.

funkyminds / cleanframes
type-class based data cleansing library for Apache Spark SQL
☆78 Updated 6 years ago
swoop-inc / spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
☆73 Updated 4 years ago
yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆60 Updated 2 years ago
radanalyticsio / silex
something to help you spark
☆64 Updated 7 years ago
AbsaOSS / atum
A dynamic data completeness and accuracy library at enterprise scale for Apache Spark
☆29 Updated last year
zalando-incubator / spark-json-schema
JSON schema parser for Apache Spark
☆82 Updated 3 years ago
springnz / sparkplug
A framework for creating composable and pluggable data processing pipelines using Apache Spark, and running them on a cluster.
☆47 Updated 9 years ago
hortonworks-spark / spark-schema-registry
Schema Registry integration for Apache Spark
☆40 Updated 2 years ago
indix / sparkplug
Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌
☆29 Updated 5 years ago
indix / schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
☆113 Updated 5 years ago
FRosner / spawncamping-dds
Data-Driven Spark allows quick data exploration based on Apache Spark.
☆29 Updated 8 years ago
FINRAOS / MegaSparkDiff
A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o…
☆52 Updated 4 months ago
rymurr / flight-spark-source
☆107 Updated 2 years ago
BryanCutler / SparkArrowFlight
Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients
☆37 Updated 4 years ago
datamindedbe / lighthouse
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…
☆62 Updated last year
ing-bank / scruid
Scala + Druid: Scruid. A library that allows you to compose queries in Scala, and parse the result back into typesafe classes.
☆115 Updated 4 years ago
tideworks / arvo2parquet
Example program that writes Parquet formatted data to plain files (i.e., not Hadoop hdfs); Parquet is a columnar storage format.
☆38 Updated 3 years ago
51zero / eel-sdk
Big Data Toolkit for the JVM
☆145 Updated 5 years ago
SaurabhChawla100 / spark-radiant
Spark-Radiant is Apache Spark Performance and Cost Optimizer
☆25 Updated 10 months ago
cerndb / SparkPlugins
Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…
☆94 Updated 6 months ago
hbutani / spark-datetime
functionstest
☆33 Updated 9 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 Updated 3 years ago
bernhard-42 / pyspark-atlas
PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection
☆18 Updated 8 years ago
holdenk / spark-validator
A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support…
☆108 Updated 7 years ago
nielsbasjes / splittablegzip
Splittable Gzip codec for Hadoop
☆74 Updated last month
KeithSSmith / spark-compaction
File compaction tool that runs on top of the Spark framework.
☆59 Updated 6 years ago
ZuInnoTe / spark-hadoopoffice-ds
A Spark datasource for the HadoopOffice library
☆37 Updated last month
liquidm / druid-dumbo
☆21 Updated 2 years ago
lightcopy / parquet-index
Spark SQL index for Parquet tables
☆134 Updated 4 years ago
sparsecode / DaFlow
Apache Spark-based Data Flow (ETL) framework which supports multiple read and write destinations of different types and also supports multiple…
☆26 Updated 4 years ago