scalingpythonml / scaling-python-with-dask
A work-in-progress book on Dask
☆12Updated last year
Alternatives and similar repositories for scaling-python-with-dask:
Users who are interested in scaling-python-with-dask are comparing it to the libraries listed below.
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models.☆42Updated last year
- Example of a simple Apache Arrow Flight service with Apache Spark and TensorFlow clients☆36Updated 3 years ago
- Record matching and entity resolution at scale in Spark☆32Updated last year
- Scaling Python Machine Learning☆45Updated last year
- A series of workshop modules introducing Feast feature store.☆19Updated 2 years ago
- Serverless Python with Ray☆54Updated 2 years ago
- ☄️ Parallel and distributed training with spaCy and Ray☆53Updated last year
- Example of how to perform distributed Bayesian optimisation (AutoML) using Optuna on Metaflow☆10Updated 3 years ago
- Python binding for DataFusion☆59Updated 2 years ago
- Instant search for and access to many datasets in PySpark.☆34Updated 2 years ago
- ☆54Updated last year
- ☆15Updated 5 years ago
- The Internals of PySpark☆25Updated 2 weeks ago
- Train Gradient Boosting and Random Forest with only SQL (VLDB 2023)☆21Updated last year
- Pandas helper functions☆30Updated last year
- Distributed Bayesian Entity Resolution in Apache Spark☆57Updated 3 years ago
- ☆104Updated last year
- Spark implementation of computing Shapley values using Monte Carlo approximation☆74Updated last year
- Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen…☆18Updated last year
- Introduction to Ray Core Design Patterns and APIs.☆66Updated last year
- PySpark phonetic and string matching algorithms☆37Updated 10 months ago
- Unified Distributed Execution☆51Updated 2 months ago
- Delta Lake helper methods. No Spark dependency.☆22Updated 4 months ago
- ☆30Updated 3 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆113Updated 9 months ago
- Point-in-Time optimizations for Apache Spark☆29Updated last year
- Projects developed by Domino's R&D team☆76Updated 2 years ago
- ☆22Updated 2 years ago