richardanaya / spark_delta_lake
☆16 Updated 4 years ago
Alternatives and similar repositories for spark_delta_lake:
Users who are interested in spark_delta_lake are comparing it to the libraries listed below.
- A simple introduction to using spark ml pipelines ☆26 Updated 6 years ago
- Some wrappers around python modules for simplifying the data exploration process. ☆13 Updated 2 months ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- Sample repo for luigi tasks & config ☆36 Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 8 years ago
- ☆15 Updated 2 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 Updated 9 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 Updated 5 years ago
- Natural Language Processing with Spark's MLlib ☆62 Updated 7 years ago
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- Tutorial repo for the article "ML in Production" ☆30 Updated last year
- Real-world Spark pipelines examples ☆83 Updated 6 years ago
- PySpark phonetic and string matching algorithms ☆39 Updated 11 months ago
- Test suite to document the behavior of Spark ☆21 Updated 3 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Asynchronous actions for PySpark ☆47 Updated 3 years ago
- Analytics on Apache Projects for Diversity ☆18 Updated 5 years ago
- ⭕️ Minimum Viable Machine Learning ☆33 Updated 4 years ago
- Know your ML Score based on Sculley's paper ☆34 Updated 5 years ago
- A couple projects using scikit-learn illustrating project decision making. ☆15 Updated 8 years ago
- Just a boilerplate for PySpark and Flask ☆35 Updated 6 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- Apache Spark under Docker ☆9 Updated 8 years ago
- Code and setup information for Introduction to Machine Learning with Spark ☆12 Updated 9 years ago
- Repository used for Spark Trainings ☆53 Updated last year
- Machines and people collaborating together through Jupyter notebooks. ☆18 Updated 7 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 Updated 8 years ago
- Utilities for writing tests that use Apache Spark. ☆24 Updated 6 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago