uber / petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
☆1,832 Updated last year
Alternatives and similar repositories for petastorm:
Users who are interested in petastorm are comparing it to the libraries listed below.
- Open Source ML Model Versioning, Metadata, and Experiment Management ☆1,720 Updated 9 months ago
- A low-latency prediction-serving system ☆1,414 Updated 4 years ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,633 Updated 2 weeks ago
- Distributed Computing for AI Made Simple ☆1,044 Updated 2 years ago
- MLeap: Deploy ML Pipelines to Production ☆1,515 Updated 5 months ago
- TFX is an end-to-end platform for deploying production ML pipelines ☆2,139 Updated last week
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,222 Updated 2 months ago
- Automated Machine Learning on Kubernetes ☆1,573 Updated last week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ☆4,511 Updated last week
- For recording and retrieving metadata associated with ML developer and data scientist workflows. ☆643 Updated last month
- A uniform interface to run deep learning models from multiple frameworks ☆935 Updated last year
- PyTorch elastic training ☆730 Updated 2 years ago
- Scalable Machine Learning with Dask ☆930 Updated 2 months ago
- Hummingbird compiles trained ML models into tensor computation for faster inference. ☆3,438 Updated 2 weeks ago
- High performance model preprocessing library on PyTorch ☆650 Updated last year
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,077 Updated last month
- Serve, optimize and scale PyTorch models in production ☆4,315 Updated last week
- Kubeflow’s superfood for Data Scientists ☆633 Updated 2 years ago
- The Open Source Feature Store for AI/ML ☆6,031 Updated this week
- Python implementation of the parquet columnar file format. ☆829 Updated last month
- Adaptive Experimentation Platform ☆2,478 Updated this week
- A distributed task scheduler for Dask ☆1,626 Updated this week
- Library for exploring and validating machine learning data ☆769 Updated this week
- Train and run PyTorch models on Apache Spark. ☆339 Updated last year
- A model-agnostic visual debugging tool for machine learning ☆1,659 Updated 3 months ago
- Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA. ☆4,297 Updated 5 months ago
- Extended pickling support for Python objects ☆1,749 Updated last month
- GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs ☆1,047 Updated 2 weeks ago
- NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale da… ☆1,081 Updated 8 months ago
- Multi Model Server is a tool for serving neural net models for inference ☆1,009 Updated 11 months ago