ucbrise / flor
Flow with FlorDB
☆151 Updated 2 months ago
Related projects
Alternatives and complementary repositories for flor
- Inspect ML Pipelines in Python in the form of a DAG ☆69 Updated 8 months ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆90 Updated 9 months ago
- The Data Linter identifies potential issues (lints) in your ML training data. ☆87 Updated 6 years ago
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 Updated last year
- Coarse-grained lineage and tracing for machine learning pipelines. ☆468 Updated 2 years ago
- A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture t… ☆166 Updated 2 weeks ago
- Ray-based Apache Beam runner ☆42 Updated last year
- Ibis Substrait Compiler ☆95 Updated this week
- Unified specification for defining and executing ML workflows, making reproducibility, consistency, and governance easier across the ML p… ☆89 Updated 7 months ago
- MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible. ☆154 Updated 2 months ago
- RAPIDS GPU-BDB ☆107 Updated 8 months ago
- A library that translates Python and NumPy to optimized distributed systems code. ☆131 Updated 2 years ago
- Willump Is a Low-Latency Useful Machine learning Platform. ☆43 Updated last year
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 Updated 4 years ago
- An open-source, vendor-neutral data context service. ☆159 Updated 6 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated last year
- Pandas ExtensionDType/Array backed by Apache Arrow ☆229 Updated last year
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆298 Updated 5 months ago
- A platform for online learning that curtails data latency and saves you cost. ☆47 Updated 2 years ago
- Distributed XGBoost on Ray ☆144 Updated 4 months ago
- A JSON-based schema for storing declarative descriptions of machine learning experiments ☆45 Updated 7 years ago
- A tool and library for easily deploying applications on Apache YARN ☆142 Updated 8 months ago
- Data System for Optimized Deep Learning Model Selection ☆20 Updated 2 years ago
- Avro2TF is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks. ☆126 Updated 4 years ago
- A Scalable Auto-ML System ☆51 Updated last year
- Ray provider for Apache Airflow ☆47 Updated 9 months ago
- Distributed SQL Query Engine in Python using Ray ☆239 Updated last month
- A Python-to-SQL transpiler as replacement for Python Pandas ☆47 Updated last year
- yogadl, the flexible data layer ☆74 Updated last year
- A benchmark to measure performance of popular Gradient boosting algorithms against popular ML datasets. ☆38 Updated 2 years ago