jupyter-incubator / sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
☆1,315 Updated last month
Related projects:
- Mirror of Apache Toree (Incubating) ☆737 Updated 2 weeks ago
- MLeap: Deploy ML Pipelines to Production ☆1,499 Updated 2 months ago
- ☆510 Updated 2 years ago
- Livy is an open source REST interface for interacting with Apache Spark from anywhere ☆1,008 Updated last year
- The Internals of Apache Spark ☆1,461 Updated this week
- Koalas: pandas API on Apache Spark ☆3,329 Updated 5 months ago
- ☆990 Updated 2 months ago
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆692 Updated last month
- Deep Learning Pipelines for Apache Spark ☆1,989 Updated last year
- Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere. ☆880 Updated this week
- XML data source for Spark SQL and DataFrames ☆500 Updated last month
- Essential Spark extensions and helper methods ✨😲 ☆747 Updated 2 years ago
- Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol) ☆727 Updated last week
- PySpark + Scikit-learn = Sparkit-learn ☆1,152 Updated 3 years ago
- Qubole Sparklens tool for performance tuning Apache Spark ☆561 Updated 2 months ago
- A curated list of awesome Apache Spark packages and resources. ☆1,686 Updated 5 months ago
- Examples for High Performance Spark ☆497 Updated 3 weeks ago
- Base classes to use when writing tests with Spark ☆1,509 Updated 2 months ago
- (Deprecated) Scikit-learn integration package for Apache Spark ☆1,079 Updated 4 years ago
- CSV Data Source for Apache Spark 1.x ☆1,053 Updated 5 years ago
- Python interface to Hive and Presto. 🐝 ☆1,670 Updated last month
- Data Lineage Tracking And Visualization Solution ☆596 Updated last week
- Sparkling Water provides H2O functionality inside Spark cluster ☆961 Updated last month
- A command-line tool for launching Apache Spark clusters. ☆637 Updated 2 months ago
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks ☆360 Updated 7 years ago
- Docker build for Apache Spark ☆676 Updated 2 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,252 Updated last week
- A Scala kernel for Jupyter ☆1,591 Updated 3 weeks ago
- A Spark plugin for reading and writing Excel files ☆460 Updated 3 weeks ago
- TonY is a framework to natively run deep learning frameworks on Apache Hadoop. ☆703 Updated 11 months ago