pachyderm / pachyderm
Data-Centric Pipelines and Data Versioning
★6,211 · Updated last month
Alternatives and similar repositories for pachyderm:
Users who are interested in pachyderm are comparing it to the repositories listed below.
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ★3,615 · Updated 3 weeks ago
- Machine Learning Toolkit for Kubernetes ★14,768 · Updated this week
- Parameterize, execute, and analyze notebooks ★6,113 · Updated 2 months ago
- High-Performance Serverless event and data processing platform ★5,410 · Updated this week
- Production infrastructure for machine learning at scale ★8,030 · Updated 9 months ago
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ★18,175 · Updated last month
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ★4,488 · Updated this week
- Parallel computing with task scheduling ★13,045 · Updated this week
- Grumpy is a Python to Go source code transcompiler and runtime. ★10,547 · Updated 3 years ago
- Visualizations for machine learning datasets ★7,364 · Updated last year
- Build, Deploy and Manage AI/ML Systems ★8,660 · Updated this week
- M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform ★4,808 · Updated this week
- PipelineAI ★4,171 · Updated 11 months ago
- An open-source graph database ★14,902 · Updated last week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ★6,109 · Updated this week
- The interactive computing suite for you! ✨ ★6,238 · Updated last year
- NumPy and Pandas interface to Big Data ★3,196 · Updated last year
- Open Source ML Model Versioning, Metadata, and Experiment Management ★1,719 · Updated 8 months ago
- Kubernetes Native Serverless Framework ★6,871 · Updated 3 years ago
- Upspin: A framework for naming everyone's everything. ★6,383 · Updated last month
- Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native) ★4,971 · Updated last year
- A next-generation curated knowledge sharing platform for data scientists and other technical professions. ★5,508 · Updated 6 months ago
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ★6,309 · Updated 2 weeks ago
- Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logi… ★8,550 · Updated this week
- Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow ★2,741 · Updated 3 years ago
- Machine Learning Platform and Recommendation Engine built on Kubernetes ★1,471 · Updated 4 years ago
- Quilt is a data mesh for connecting people with actionable data ★1,332 · Updated this week
- Beaker Extensions for Jupyter Notebook ★2,809 · Updated last year
- P2P Docker registry capable of distributing TBs of data in seconds ★6,296 · Updated this week
- Pinball is a scalable workflow manager ★1,045 · Updated 5 years ago