richardanaya / spark_delta_lake
☆16 Updated 5 years ago
Alternatives and similar repositories for spark_delta_lake
Users who are interested in spark_delta_lake are comparing it to the libraries listed below.
- Sample repo for luigi tasks & config ☆36 Updated 9 years ago
- ☆34 Updated 9 years ago
- Open source Flotilla ☆195 Updated last week
- A couple projects using scikit-learn illustrating project decision making. ☆15 Updated 9 years ago
- Common data science and data engineering utilities to help us perform analytics. Our toolbox for data scientists, licensed under Apache-2… ☆30 Updated 7 years ago
- A short guide for transitioning from Python to Scala ☆65 Updated 10 years ago
- Deep Learning for Pugs ☆74 Updated 8 years ago
- scaffold of Apache Airflow executing Docker containers ☆85 Updated 3 years ago
- Magic functions for using Jupyter Notebook with Apache Spark and a variety of SQL databases. ☆171 Updated 7 years ago
- Content for architecting a data science platform for products using Luigi, Spark & Flask. ☆163 Updated 6 years ago
- Installation guide for Apache Spark + Hadoop on Mac/Linux ☆60 Updated 8 years ago
- An example mini data warehouse for python project stats, template for new projects ☆178 Updated 5 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse… ☆90 Updated 10 years ago
- Material for some talks I have given ☆61 Updated last year
- ☆31 Updated 9 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆68 Updated 10 years ago
- T4 is now in production as Quilt 3 ☆64 Updated 6 years ago
- Tough and flexible tools for data analysis, transformation, validation and movement. ☆140 Updated 2 years ago
- Repo for building docker based airflow image. Containers support multiple features like writing logs to local or S3 folder and Initializi… ☆32 Updated 6 years ago
- Airflow plugin to transfer arbitrary files between operators ☆78 Updated 7 years ago
- Apache Avro <-> pandas DataFrame ☆137 Updated 5 months ago
- Airflow workflow management platform chef cookbook. ☆70 Updated 6 years ago
- ☆24 Updated 7 years ago
- SQL on dataframes - pandas and dask ☆64 Updated 7 years ago
- ☆52 Updated 9 years ago
- A simple example of containerized data science with python and Docker. ☆51 Updated 7 years ago
- Learn the pyspark API through pictures and simple examples ☆170 Updated 5 years ago
- An in-depth tutorial on sklearn's Pipeline and FeatureUnion classes. ☆16 Updated 8 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆43 Updated 6 years ago
- All information related to the LOAD CSV meetup / webinar. ☆86 Updated 5 years ago