adipolak / ml-with-apache-spark
A series of Jupyter notebooks that walk you through machine learning with the Apache Spark ecosystem using Spark MLlib, PyTorch, and TensorFlow.
☆86Oct 12, 2023Updated 2 years ago
Alternatives and similar repositories for ml-with-apache-spark
Users who are interested in ml-with-apache-spark are comparing it to the libraries listed below.
- Three-week Scaling Machine Learning course in collaboration with O'Reilly, following the guidance of Adi Polak's book - Scaling Machi…☆23May 12, 2023Updated 2 years ago
- ☆13Sep 9, 2024Updated last year
- Use a hybrid solver to select features from two data sets☆16Oct 14, 2025Updated 4 months ago
- PostgreSQL + Grafana with test data running in Docker Compose. This is the repo used for the talk I gave at PostgresConf NYC 2019.☆11Sep 16, 2021Updated 4 years ago
- Official code for ICML 2024 paper "An Unsupervised Approach for Periodic Source Detection in Time Series"☆13Feb 21, 2025Updated 11 months ago
- Data files for the main dysts repository☆21Sep 23, 2025Updated 4 months ago
- Artificial Intelligence for Big Data, published by Packt☆17Feb 5, 2026Updated last week
- Customized Jupyter Spark Docker images with everything you need☆16May 3, 2025Updated 9 months ago
- Monotonic Optimal Binning algorithm is a statistical approach to transform continuous variables into optimal and monotonic categorical va…☆17Nov 6, 2025Updated 3 months ago
- lakeFS airflow operator☆27Oct 23, 2023Updated 2 years ago
- Full Machine Learning Lifecycle using Airflow, MLflow, and AWS S3☆26Mar 28, 2023Updated 2 years ago
- Deploy any Machine Learning model serverless in AWS.☆24Oct 17, 2018Updated 7 years ago
- GitHub Repo for the UChicago, Spring 2021 course *Are We Doomed? Confronting the End of the World*☆12Mar 30, 2021Updated 4 years ago
- A set of tools that make working with the Scala ecosystem even better.☆12Feb 6, 2026Updated last week
- AI enhanced automation tool for financial modelling and market analysis.☆11Sep 10, 2019Updated 6 years ago
- Predict if a reservation will be canceled using robust Machine Learning pipelines with Airflow and Mlflow☆66Jan 12, 2024Updated 2 years ago
- Open Benchmarks for Evaluating the Performance of Feature Stores☆38Mar 17, 2024Updated last year
- A Scala library for Firestore in Datastore mode☆13Jun 11, 2024Updated last year
- ☆10Aug 6, 2024Updated last year
- ☆15Apr 23, 2025Updated 9 months ago
- Breast cancer data mining with Python and sklearn☆11Apr 13, 2019Updated 6 years ago
- A Python function for bootstrapping☆10Nov 5, 2019Updated 6 years ago
- AQIPython is a Python module that calculates the Air Quality Index (AQI) for various air pollutants based on different standards.☆10Mar 5, 2024Updated last year
- A fun little data analysis project on whether Americans prefer Mexican food over Italian or Chinese food.☆12Sep 11, 2017Updated 8 years ago
- RabbitMQ producer and consumer example with fastapi☆12Jan 25, 2023Updated 3 years ago
- List of SIC codes and descriptions from authoritative sources☆12Mar 14, 2017Updated 8 years ago
- Self-optimizing nth order Savitzky-Golay filter☆12Aug 31, 2023Updated 2 years ago
- ☆10Feb 13, 2024Updated 2 years ago
- My solutions for the Udacity Data Engineering Nanodegree☆34Oct 14, 2019Updated 6 years ago
- A python wrapper for the QuantAQ RESTful API☆11Dec 24, 2025Updated last month
- Build Your Own Neural Network Design☆14Aug 3, 2020Updated 5 years ago
- ☆14Sep 17, 2025Updated 4 months ago
- Scraper for aqicn.org☆11Sep 4, 2018Updated 7 years ago
- ☆11Jan 13, 2024Updated 2 years ago
- full code written for the Twilio blog https://www.twilio.com/blog/media-file-storage-python-flask-amazon-s3-buckets☆11May 4, 2024Updated last year
- ☆14Dec 12, 2024Updated last year
- ARCHIVED A high-performance database of shipment-level CITES trade data☆12May 11, 2023Updated 2 years ago
- a simple lakeFS webhook for pre-commit and pre-merge validation of data objects☆12Nov 9, 2023Updated 2 years ago
- Kafka library with a schema registry integration☆10Dec 16, 2025Updated last month