tcarette / vim-sparkShell
control spark-shell from vim
☆10 Updated 7 years ago
Related projects:
- ☆32 Updated this week
- functionstest ☆33 Updated 7 years ago
- A pyspark lib to validate data quality ☆18 Updated last year
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated 10 months ago
- Utilities for writing tests that use Apache Spark. ☆24 Updated 5 years ago
- Dependency and data pipeline management framework for Spark and Scala ☆15 Updated 7 years ago
- Utilities to work with Scala/Java code with py4j ☆40 Updated 8 months ago
- A Giter8 template for scio ☆29 Updated 2 months ago
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- Test suite to document the behavior of Spark ☆21 Updated 3 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆60 Updated 2 weeks ago
- Functional Airflow DAG definitions. ☆38 Updated 7 years ago
- ☆37 Updated this week
- Simple Spark example of generating table stats for use of data quality checks ☆28 Updated 7 years ago
- A library for exporting Spark ML models and pipelines to PFA ☆54 Updated 5 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆53 Updated 6 years ago
- CLI tool for syncing a Databricks folder structure with a local git repo. ☆17 Updated last month
- Data Science with Apache Spark and Spark Notebook ☆30 Updated 7 years ago
- something to help you spark ☆65 Updated 5 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 Updated 4 years ago
- Additional useful algorithms that can be used with spark. ☆24 Updated 9 years ago
- Examples for High Performance Spark ☆15 Updated 3 weeks ago
- An Apache Spark-shell backend for IPython ☆107 Updated 3 years ago
- ☆21 Updated this week
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support… ☆106 Updated 6 years ago
- Scala implementation of Histogrammar, with optional front-ends and back-ends as separate Maven projects. ☆15 Updated 8 months ago
- A framework for creating composable and pluggable data processing pipelines using Apache Spark, and running them on a cluster. ☆48 Updated 8 years ago
- Parquet Command-line Tools ☆18 Updated 7 years ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 Updated 7 years ago
- Apache Spark AWS Lambda Executor (SAMBA) ☆44 Updated 6 years ago