Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights into your Spark DataFrames.
☆102Aug 20, 2019Updated 6 years ago
Alternatives and similar repositories for pyspark_dist_explore
Users that are interested in pyspark_dist_explore are comparing it to the libraries listed below
- Create HTML profiling reports from Apache Spark DataFrames☆197Feb 2, 2020Updated 6 years ago
- Tutorials for using PyDAAL, i.e. the Python API of Intel Data Analytics Acceleration Library☆11Apr 13, 2018Updated 7 years ago
- How to save a model for tfserving☆11Jan 13, 2018Updated 8 years ago
- data analysis, big data development, cloud, and any other cool things!☆31Jul 30, 2024Updated last year
- Sandbox for generating visualizations of the bias-variance tradeoff for Machine Learning at Berkeley's blog.☆13Jun 26, 2017Updated 8 years ago
- A POC of Google's Wide & Deep Learning models deployed on Google Cloud ML Engine for Kaggle's Outbrain Click Competition☆36Jun 19, 2018Updated 7 years ago
- more data science resources☆14Jun 4, 2022Updated 3 years ago
- SigOpt's public R client☆13Aug 22, 2023Updated 2 years ago
- Talk and demonstration of how to integrate TensorFlow with R☆16May 23, 2019Updated 6 years ago
- Example R Shiny Application on Heroku☆19Aug 31, 2022Updated 3 years ago
- Materials from the Data Science with Spark and R☆21Nov 15, 2018Updated 7 years ago
- I developed this case study in only 7 days with PySpark (Spark 1.6.0) SQL & MLlib. I used a Databricks cluster and AWS. 90% AUC is achieved…☆17May 7, 2016Updated 9 years ago
- Keep track of your results☆19Aug 3, 2020Updated 5 years ago
- My code for the kaggle Cats and Dogs Redux competition. Placed in top 8%.☆13Mar 23, 2017Updated 8 years ago
- ☆20Aug 20, 2016Updated 9 years ago
- Machine learning framework for electronic structure prediction of molecules☆19Sep 5, 2017Updated 8 years ago
- Demonstrates calling a Scala UDF from Python using spark-submit with an EGG and JAR☆23Mar 3, 2020Updated 6 years ago
- Business Data Analysis by HiPIC of CalStateLA☆21Oct 26, 2018Updated 7 years ago
- WSGI adapter for AWS API Gateway/Lambda Proxy Integration. Mirrored from GitLab.☆19Sep 4, 2018Updated 7 years ago
- Analyzing NBA data using Spark 2.1☆47Feb 1, 2017Updated 9 years ago
- The mlr package online tutorial☆20Jul 20, 2018Updated 7 years ago
- ☆14Aug 9, 2017Updated 8 years ago
- We write sample code for two tower models for retrieval and add RLHF/RLAIF style alignment with a ranking model to make the retrieval mor…☆106Feb 9, 2025Updated last year
- PySpark Machine Learning Examples☆45Mar 8, 2018Updated 8 years ago
- An R-based, httr-style interface for the Power BI REST API.☆20May 9, 2017Updated 8 years ago
- Materials for "Machine Learning on Big Data" course☆22Jul 23, 2023Updated 2 years ago
- Compare methods of face landmarks☆24Jun 29, 2018Updated 7 years ago
- ☆24Jan 8, 2019Updated 7 years ago
- An R package that makes lightgbm models fully interpretable (take reference from https://github.com/AppliedDataSciencePartners/xgboostExp…☆23Aug 8, 2019Updated 6 years ago
- Updated repository☆157Nov 25, 2021Updated 4 years ago
- How to use SHAP values for better cluster analysis☆60May 15, 2022Updated 3 years ago
- An R package providing access to the OpenAI Gym API☆21Jul 1, 2017Updated 8 years ago
- Minimal example to set up a Jenkins-CI pipeline for data science projects on OpenShift in a couple of minutes.☆27Jan 7, 2025Updated last year
- MLflow samples - deprecated☆22May 9, 2023Updated 2 years ago
- Introduction to Shiny workshop for satRday conference☆25Feb 13, 2017Updated 9 years ago
- https://www.kaggle.com/c/microsoft-malware-prediction/leaderboard☆22Mar 14, 2019Updated 6 years ago
- Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References☆69Jan 21, 2019Updated 7 years ago
- Lasagne / Theano tutorials for Nvidia Deep Learning Summercamp 2016☆26Sep 29, 2016Updated 9 years ago
- Apache Spark 2x Machine Learning Cookbook, published by Packt☆33Jul 23, 2025Updated 7 months ago