uber / petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
☆1,817 Updated last year
Alternatives and similar repositories for petastorm:
Users who are interested in petastorm are comparing it to the libraries listed below:
- Open Source ML Model Versioning, Metadata, and Experiment Management ☆1,716 Updated 6 months ago
- MLeap: Deploy ML Pipelines to Production ☆1,515 Updated 2 months ago
- A low-latency prediction-serving system ☆1,412 Updated 3 years ago
- Automated Machine Learning on Kubernetes ☆1,543 Updated this week
- TFX is an end-to-end platform for deploying production ML pipelines ☆2,127 Updated last week
- Library for exploring and validating machine learning data ☆768 Updated 2 weeks ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,196 Updated last week
- Model analysis tools for TensorFlow ☆1,261 Updated this week
- A uniform interface to run deep learning models from multiple frameworks ☆936 Updated last year
- Scalable Machine Learning with Dask ☆916 Updated last week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ☆4,452 Updated this week
- Input pipeline framework ☆984 Updated last week
- TonY is a framework to natively run deep learning frameworks on Apache Hadoop. ☆708 Updated last year
- For recording and retrieving metadata associated with ML developer and data scientist workflows. ☆637 Updated 3 months ago
- Distributed Computing for AI Made Simple ☆1,041 Updated last year
- The Open Source Feature Store for AI/ML ☆5,815 Updated this week
- Hummingbird compiles trained ML models into tensor computation for faster inference. ☆3,388 Updated 3 weeks ago
- Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO ☆718 Updated last week
- PyTorch elastic training ☆730 Updated 2 years ago
- cuML - RAPIDS Machine Learning Library ☆4,406 Updated this week
- Universal model exchange and serialization format for decision tree forests ☆751 Updated last month
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,604 Updated this week
- High performance model preprocessing library on PyTorch ☆651 Updated 10 months ago
- A model-agnostic visual debugging tool for machine learning ☆1,652 Updated 2 weeks ago
- NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale da… ☆1,070 Updated 5 months ago
- Extended pickling support for Python objects ☆1,702 Updated last month
- A system for quickly generating training data with weak supervision ☆5,830 Updated 9 months ago
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ☆1,948 Updated 2 years ago
- Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA. ☆4,281 Updated 2 months ago
- Train and run PyTorch models on Apache Spark. ☆340 Updated last year