amesar / spark-python-scala-udf
Demonstrates calling a Scala UDF from Python using spark-submit with an EGG and JAR
☆21 Updated 4 years ago
Related projects
Alternatives and complementary repositories for spark-python-scala-udf
- Example project showing how to use Hive UDFs in Apache Spark ☆55 Updated 5 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- Asynchronous actions for PySpark ☆45 Updated 2 years ago
- Structured Streaming Machine Learning example with Spark 2.0 ☆92 Updated 7 years ago
- The iterative broadcast join example code. ☆69 Updated 7 years ago
- ☆71 Updated 3 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Additional useful algorithms that can be used with spark. ☆24 Updated 9 years ago
- ☆33 Updated 5 years ago
- functionstest ☆33 Updated 8 years ago
- Example Maven configuration for a Spark, Scala project ☆54 Updated 2 years ago
- A sample implementation of the Spark Datasource API ☆24 Updated 7 years ago
- Training materials for Strata, AMP Camp, etc ☆150 Updated 8 years ago
- Cheatsheet for Spark DataFrame ☆91 Updated 4 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- Spark Implementation of Google Facets Overview https://github.com/PAIR-code/facets ☆54 Updated last year
- Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol ☆34 Updated 2 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 Updated 3 years ago
- Based off the design of SparkOnHBase. This Repo will support Spark, Spark Streaming, and Spark SQL integration with Kudu. ☆51 Updated 8 years ago
- PySpark phonetic and string matching algorithms ☆35 Updated 8 months ago
- type-class based data cleansing library for Apache Spark SQL ☆79 Updated 5 years ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer ☆25 Updated 2 years ago
- Example project to show how to use Spark to read and write Avro/Parquet files ☆50 Updated 11 years ago
- Filling in the Spark function gaps across APIs ☆50 Updated 3 years ago
- Spark package for checking data quality ☆221 Updated 4 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago