uber / petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
☆1,798 Updated 11 months ago
Related projects
Alternatives and complementary repositories for petastorm
- Open Source ML Model Versioning, Metadata, and Experiment Management ☆1,700 Updated 3 months ago
- A low-latency prediction-serving system ☆1,404 Updated 3 years ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,159 Updated this week
- MLeap: Deploy ML Pipelines to Production ☆1,503 Updated 4 months ago
- Automated Machine Learning on Kubernetes ☆1,507 Updated this week
- Scalable Machine Learning with Dask ☆900 Updated 3 months ago
- Library for exploring and validating machine learning data ☆763 Updated last week
- TFX is an end-to-end platform for deploying production ML pipelines ☆2,114 Updated this week
- Distributed Computing for AI Made Simple ☆1,043 Updated last year
- Hummingbird compiles trained ML models into tensor computation for faster inference. ☆3,352 Updated 2 weeks ago
- For recording and retrieving metadata associated with ML developer and data scientist workflows. ☆625 Updated 2 weeks ago
- A uniform interface to run deep learning models from multiple frameworks ☆936 Updated 10 months ago
- Extended pickling support for Python objects ☆1,656 Updated 3 weeks ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,567 Updated this week
- Adaptive Experimentation Platform ☆2,375 Updated this week
- NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale da… ☆1,047 Updated 2 months ago
- Model analysis tools for TensorFlow ☆1,254 Updated this week
- A system for quickly generating training data with weak supervision ☆5,807 Updated 6 months ago
- The Open Source Feature Store for Machine Learning ☆5,592 Updated this week
- High performance model preprocessing library on PyTorch ☆648 Updated 7 months ago
- PyTorch elastic training ☆730 Updated 2 years ago
- TonY is a framework to natively run deep learning frameworks on Apache Hadoop. ☆708 Updated last year
- A model-agnostic visual debugging tool for machine learning ☆1,649 Updated last year
- MLBox is a powerful Automated Machine Learning Python library. ☆1,500 Updated last year
- Universal model exchange and serialization format for decision tree forests ☆738 Updated this week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ☆4,377 Updated this week
- Experiment tracking, ML developer tools ☆870 Updated last year
- Intake is a lightweight package for finding, investigating, loading and disseminating data. ☆1,011 Updated last month
- Python implementation of the Parquet columnar file format. ☆783 Updated last week