richardanaya / spark_delta_lake
☆16 Updated 5 years ago
Alternatives and similar repositories for spark_delta_lake
Users who are interested in spark_delta_lake are comparing it to the repositories listed below.
- Sample repo for luigi tasks & config ☆36 Updated 9 years ago
- Natural Language Processing with Spark's MLlib ☆62 Updated 8 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 Updated 5 years ago
- Learn the pyspark API through pictures and simple examples ☆170 Updated 4 years ago
- Installation guide for Apache Spark + Hadoop on Mac/Linux ☆60 Updated 8 years ago
- scaffold of Apache Airflow executing Docker containers ☆86 Updated 2 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆67 Updated 9 years ago
- ☕⛵WIP PySpark dependency management ☆22 Updated 7 years ago
- Open source Flotilla ☆196 Updated last week
- Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL ☆94 Updated 2 years ago
- A luigi powered analytics / warehouse stack ☆88 Updated 8 years ago
- Airflow workflow management platform chef cookbook. ☆71 Updated 6 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 9 years ago
- Common data science and data engineering utilities to help us perform analytics. Our toolbox for data scientists, licensed under Apache-2… ☆30 Updated 7 years ago
- An example mini data warehouse for python project stats, template for new projects ☆178 Updated 5 years ago
- Repository used for Spark Trainings ☆54 Updated 2 years ago
- Python bindings for the Domino APIs ☆55 Updated last week
- Material for some talks I have given ☆62 Updated last year
- Magic functions for using Jupyter Notebook with Apache Spark and a variety of SQL databases. ☆171 Updated 6 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago
- Run EMR workloads on EKS ☆13 Updated 4 years ago
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 Updated 5 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆102 Updated 6 years ago
- Airflow plugin to transfer arbitrary files between operators ☆78 Updated 6 years ago
- HandySpark - bringing pandas-like capabilities to Spark dataframes ☆196 Updated 6 years ago
- ☆16 Updated 3 years ago
- Cheatsheet for Spark DataFrame ☆91 Updated 5 years ago
- ☆34 Updated 9 years ago
- Example unit tests for Apache Spark Python scripts using the py.test framework ☆84 Updated 9 years ago
- Deep Learning for Pugs ☆74 Updated 8 years ago