vsmolyakov / pyspark
Spark (Scala and Python)
☆18 Updated 4 years ago
Related projects
Alternatives and complementary repositories for pyspark
- ☆15 Updated 2 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 7 years ago
- pyspark sample scripts ☆17 Updated 5 years ago
- ☆19 Updated 3 years ago
- Companion code for my PyData talk: "Introduction to Probabilistic Programming with PyMC3" ☆13 Updated 5 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- ☆26 Updated 10 months ago
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- How to do data science with Optimus, Spark and Python. ☆18 Updated 5 years ago
- Repository for medium article ☆22 Updated 10 months ago
- Tutorial repo for the article "ML in Production" ☆30 Updated last year
- ☆11 Updated 6 years ago
- This is all my random garbage. ☆26 Updated last year
- Work for Mastering Large Datasets with Python ☆18 Updated last year
- Visualization ideas for data science ☆19 Updated 6 years ago
- Know your ML Score based on Sculley's paper ☆34 Updated 5 years ago
- Predict whether a student will correctly answer a problem based on past performance using automated feature engineering ☆32 Updated 4 years ago
- Techniques & resources for training interpretable ML models, explaining ML models, and debugging ML models. ☆21 Updated 2 years ago
- Distributed, large-scale, benchmarking framework for rigorous assessment of automatic machine learning repositories, projects, and librar… ☆30 Updated 2 years ago
- ☆15 Updated 6 years ago
- H2OAI Driverless AI Code Samples and Tutorials ☆37 Updated 3 weeks ago
- ☆16 Updated 5 years ago
- This repository contains code files specifically IPython notebooks for the assignments in the course "Scalable Machine Learning" by UC Be… ☆30 Updated 9 years ago
- ☆14 Updated 9 years ago
- GA Data Science NY Section 7 ☆27 Updated 10 years ago
- PyCon 2017 tutorial on time series analysis ☆72 Updated 7 years ago
- Record matching and entity resolution at scale in Spark ☆31 Updated last year
- PyDataLondonTutorial ☆26 Updated 8 years ago