criteo / mlflow-yarn
Backend implementation for running MLflow projects on Hadoop/YARN.
☆11 Updated 2 years ago
Alternatives and similar repositories for mlflow-yarn:
Users who are interested in mlflow-yarn compare it to the libraries listed below.
- A library on top of either pex or conda-pack to make your Python code easily available on a cluster ☆45 Updated 4 months ago
- Dockerized setup for testing code on realistic Hadoop clusters ☆27 Updated 4 years ago
- Sparrow is a boosting algorithm implementation that is optimized for training on very large datasets and/or in the limited memory setting… ☆21 Updated 4 years ago
- Spawn JupyterHub single user notebook servers in Hadoop/YARN containers. ☆19 Updated 2 years ago
- A tool and library for easily deploying applications on Apache YARN ☆143 Updated last year
- The open-source Useful SDK. One Python decorator in the Useful library allows for full observability of Python functions within an ETL. ☆20 Updated last year
- A data pipeline orchestration library for rapid iterative development with automatic cache invalidation allowing users to focus writing t… ☆30 Updated last week
- A conda-smithy repository for python-duckdb. ☆13 Updated 2 weeks ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆108 Updated 3 months ago
- Tools for MLflow ☆37 Updated last year
- A Python package that parses SQL and converts it to Ibis expressions ☆54 Updated last year
- Mirror of Apache Arrow site ☆38 Updated this week
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- A framework for data piping in Python ☆37 Updated last year
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 Updated 5 years ago
- This repository is no longer maintained. ☆15 Updated 3 years ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆24 Updated last year
- Teradata SQL Extension for Jupyter ☆26 Updated this week
- RFC document, tooling and other content related to the dataframe API standard ☆106 Updated last year
- Dask integration for Snowflake ☆30 Updated 4 months ago
- Apache Arrow Cookbook ☆101 Updated last month
- The deepr module provides abstractions (layers, readers, prepro, metrics, config) to help build TensorFlow models on top of tf estimators ☆52 Updated last year
- Apache Arrow Development Experiments ☆17 Updated 2 months ago
- API for reading and writing data via various file transfer protocols from Apache Spark. ☆21 Updated 4 years ago
- Native Polars deltalake reader ☆9 Updated 7 months ago
- 🎯 aimrocks 🎸 — Python & Cython bindings for RocksDB. Batteries included! 🔋 ☆31 Updated last month