richardanaya / spark_delta_lake
☆16Updated 5 years ago
Alternatives and similar repositories for spark_delta_lake
Users who are interested in spark_delta_lake are comparing it to the libraries listed below.
- Sample repo for luigi tasks & config☆36Updated 9 years ago
- scaffold of Apache Airflow executing Docker containers☆86Updated 2 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S…☆67Updated 9 years ago
- Deep Learning for Pugs☆74Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin☆52Updated 9 years ago
- Installation guide for Apache Spark + Hadoop on Mac/Linux☆60Updated 7 years ago
- Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL☆93Updated 2 years ago
- Analyzing Clickstream Data using Markov Chains and data mining SPACE algorithm☆29Updated 7 years ago
- Open source Flotilla☆195Updated last week
- Conversion utility from Zeppelin notes to Jupyter notebooks.☆44Updated 5 years ago
- Natural Language Processing with Spark's MLlib☆62Updated 7 years ago
- A simple introduction to using spark ml pipelines☆26Updated 7 years ago
- Material for some talks I have given☆62Updated last year
- Know your ML Score based on Sculley's paper☆34Updated 6 years ago
- Learn the pyspark API through pictures and simple examples☆170Updated 4 years ago
- A couple projects using scikit-learn illustrating project decision making.☆15Updated 8 years ago
- Magic functions for using Jupyter Notebook with Apache Spark and a variety of SQL databases.☆171Updated 6 years ago
- A short guide for transitioning from Python to Scala☆65Updated 9 years ago
- Content for architecting a data science platform for products using Luigi, Spark & Flask.☆163Updated 5 years ago
- ☆20Updated 8 years ago
- Workshop for Spark and Databricks☆54Updated 5 years ago
- Tutorial repo for the article "ML in Production"☆30Updated 2 years ago
- Common data science and data engineering utilities to help us perform analytics. Our toolbox for data scientists, licensed under Apache-2…☆30Updated 7 years ago
- pyspark sample scripts☆17Updated 6 years ago
- A luigi powered analytics / warehouse stack☆88Updated 8 years ago
- Analyzing NBA data using Spark 2.1☆46Updated 8 years ago
- All information related to the LOAD CSV meetup / webinar.☆86Updated 5 years ago
- Github mirror of "analytics/refinery" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access…☆19Updated this week
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse…☆90Updated 9 years ago
- HDP Data Science/Machine Learning demo☆37Updated 10 years ago