joblib/joblib-spark

joblib / joblib-spark

Joblib Apache Spark Backend

☆250

Alternatives and similar repositories for joblib-spark

Users that are interested in joblib-spark are comparing it to the libraries listed below.

Affirm / shparkley
Spark implementation of computing Shapley Values using monte-carlo approximation
☆80 · Mar 20, 2023 · Updated 3 years ago

mrpowers-io / quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
☆686 · Jun 9, 2026 · Updated last month

cemoody / flexi_hash_embedding
PyTorch Flexible Hash Embeddings
☆29 · Feb 4, 2020 · Updated 6 years ago

matloff / FarewellAddress
☆23 · Jun 4, 2023 · Updated 3 years ago

ThinkBigAnalytics / pyspark-distributed-kmodes
☆24 · Jan 8, 2019 · Updated 7 years ago

uber / petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f…
☆1,889 · Jan 2, 2026 · Updated 6 months ago

hvanhovell / weld-java
JVM integration for Weld
☆16 · Sep 24, 2018 · Updated 7 years ago

dhirensk / ai
Personal AI Implementations
☆16 · Jul 29, 2020 · Updated 5 years ago

alteryx / featuretools
An open source python library for automated feature engineering
☆7,663 · Updated this week

squito / spark-memory
A tool to get better debug info on spark's memory usage
☆42 · Aug 21, 2019 · Updated 6 years ago

avulanov / ann-benchmark
Benchmarks of artificial neural network library for Spark MLlib
☆11 · Dec 3, 2015 · Updated 10 years ago

wxhC3SC6OPm8M1HXboMy / spark-mrmr-feature-selection
Machine learning enhancements to Spark MlLib
☆20 · Mar 19, 2015 · Updated 11 years ago

microsoft / SynapseML
Simple and Distributed Machine Learning
☆5,231 · Jun 25, 2026 · Updated 2 weeks ago

koaning / scikit-lego
Extra blocks for scikit-learn pipelines.
☆1,401 · Jul 2, 2026 · Updated last week

svenkreiss / pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
☆270 · Sep 3, 2024 · Updated last year

graphframes / graphframes
GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs
☆1,190 · Jun 23, 2026 · Updated 2 weeks ago

sparklingpandas / sparklingml
Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)
☆73 · Nov 9, 2023 · Updated 2 years ago

fugue-project / tutorials
Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…
☆113 · Nov 10, 2025 · Updated 8 months ago

LucaCanali / sparkMeasure
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…
☆827 · May 19, 2026 · Updated last month

julioasotodv / spark-df-profiling
Create HTML profiling reports from Apache Spark DataFrames
☆197 · Feb 2, 2020 · Updated 6 years ago

paiqo / Databricks-VSCode
VSCode extension to work with Databricks
☆135 · Updated this week

NVIDIA / spark-xgboost-examples
XGBoost GPU accelerated on Spark example applications
☆52 · Aug 3, 2022 · Updated 3 years ago

SgfdDttt / sara
StAtutory Reasoning Assessment
☆17 · Dec 8, 2022 · Updated 3 years ago

MrPowers / chispa
PySpark test helper methods with beautiful error messages
☆771 · May 20, 2026 · Updated last month

brendanhasz / tfp-taxi
Taxi fare prediction using tensorflow probability
☆15 · Jul 23, 2019 · Updated 6 years ago

jefftriplett / python-actions-alpha-archived
Please note that this was for the *alpha* version of GitHub Actions for Python.
☆15 · Jan 29, 2019 · Updated 7 years ago

modin-project / modin
Modin: Scale your Pandas workflows by changing a single line of code
☆10,391 · Feb 10, 2026 · Updated 4 months ago

MrPowers / bebe
Filling in the Spark function gaps across APIs
☆50 · Apr 14, 2021 · Updated 5 years ago

scikit-learn-contrib / imbalanced-learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
☆7,107 · Jun 29, 2026 · Updated last week

G-Research / spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
☆238 · Jul 1, 2026 · Updated last week

maxpumperla / elephas
Distributed Deep learning with Keras & Spark
☆1,579 · May 1, 2023 · Updated 3 years ago

vaexio / vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s…
☆8,507 · Apr 1, 2026 · Updated 3 months ago

szilard / GBM-perf
Performance of various open source GBM implementations
☆224 · Feb 17, 2026 · Updated 4 months ago

Raschka-research-group / corn-ordinal-neuralnet
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"
☆29 · May 23, 2022 · Updated 4 years ago

twosigma / flint
A Time Series Library for Apache Spark
☆1,166 · Jul 3, 2020 · Updated 6 years ago

japila-books / apache-spark-internals
The Internals of Apache Spark
☆1,547 · Apr 12, 2026 · Updated 2 months ago

ryanchao2012 / airfly
Auto Generate Airflow's dag.py On The Fly
☆14 · Feb 10, 2025 · Updated last year

feast-dev / feast
The Open Source Feature Store for AI/ML
☆7,123 · Updated this week

delta-io / delta-sharing
An open protocol for secure data sharing
☆951 · Updated this week