pachyderm / pachyderm
Data-Centric Pipelines and Data Versioning
☆6,256 Updated 8 months ago
Alternatives and similar repositories for pachyderm
Users who are interested in pachyderm are comparing it to the libraries listed below.
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,682 Updated last month
- Machine Learning Toolkit for Kubernetes ☆15,233 Updated 2 months ago
- High-Performance Serverless event and data processing platform ☆5,591 Updated this week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ☆4,648 Updated this week
- 📚 Parameterize, execute, and analyze notebooks ☆6,282 Updated last week
- Production infrastructure for machine learning at scale ☆8,034 Updated last year
- 🦉 Data Versioning and ML Experiments ☆14,948 Updated this week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ☆6,531 Updated this week
- The Open Source Feature Store for AI/ML ☆6,383 Updated this week
- Quilt is a data mesh for connecting people with actionable data ☆1,348 Updated this week
- Open Source ML Model Versioning, Metadata, and Experiment Management ☆1,740 Updated last year
- Build, Manage and Deploy AI/ML Systems ☆9,590 Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,455 Updated last month
- Python Stream Processing ☆6,824 Updated last year
- A GPU-powered real-time analytics storage and query engine. ☆3,065 Updated last year
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,522 Updated 4 months ago
- Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logi… ☆8,929 Updated this week
- Parallel computing with task scheduling ☆13,518 Updated this week
- the portable Python dataframe library ☆6,136 Updated last week
- Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data. ☆6,135 Updated this week
- A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow ☆2,083 Updated last year
- ♾️ CML - Continuous Machine Learning | CI/CD for ML ☆4,139 Updated 4 months ago
- A system for quickly generating training data with weak supervision ☆5,923 Updated last year
- Multi-user server for Jupyter notebooks ☆8,147 Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆16,031 Updated this week
- Beaker Extensions for Jupyter Notebook ☆2,832 Updated last year
- An open-source graph database ☆14,981 Updated 3 months ago
- lakeFS - Data version control for your data lake | Git for data ☆4,907 Updated last week
- Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f… ☆1,859 Updated 3 weeks ago
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ☆10,570 Updated this week