Data-Centric Pipelines and Data Versioning
★ 6,287 · Feb 3, 2025 · Updated last year
Alternatives and similar repositories for pachyderm
Users who are interested in pachyderm are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes ★ 15,496 · Jan 5, 2026 · Updated 2 months ago
- Data Versioning and ML Experiments ★ 15,421 · Mar 2, 2026 · Updated last week
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ★ 3,698 · Updated this week
- Workflow Engine for Kubernetes ★ 16,495 · Mar 5, 2026 · Updated last week
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ★ 18,691 · Updated this week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ★ 4,732 · Mar 1, 2026 · Updated last week
- Build, Manage and Deploy AI/ML Systems ★ 9,903 · Mar 5, 2026 · Updated last week
- High-Performance Serverless event and data processing platform ★ 5,682 · Mar 4, 2026 · Updated last week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ★ 6,833 · Updated this week
- high-performance graph database for real-time use cases ★ 21,639 · Mar 4, 2026 · Updated last week
- Production infrastructure for machine learning at scale ★ 8,028 · Jun 12, 2024 · Updated last year
- The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, … ★ 24,619 · Updated this week
- An orchestration platform for the development, production, and observation of data assets. ★ 15,080 · Updated this week
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ★ 21,782 · Updated this week
- Parallel computing with task scheduling ★ 13,760 · Mar 5, 2026 · Updated last week
- Parameterize, execute, and analyze notebooks ★ 6,397 · Updated this week
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ★ 10,781 · Updated this week
- The Open Source Feature Store for AI/ML ★ 6,778 · Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ★ 44,510 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★ 41,617 · Updated this week
- Easy and Repeatable Kubernetes Development ★ 15,767 · Mar 5, 2026 · Updated last week
- Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or … ★ 3,555 · Mar 4, 2026 · Updated last week
- OpenFaaS - Serverless Functions Made Simple ★ 26,109 · Updated this week
- An open-source graph database ★ 15,036 · Nov 22, 2025 · Updated 3 months ago
- Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native) ★ 5,009 · Nov 13, 2023 · Updated 2 years ago
- A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Doc… ★ 2,530 · Feb 21, 2024 · Updated 2 years ago
- Apache Superset is a Data Visualization and Data Exploration Platform ★ 70,860 · Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ★ 28,255 · Mar 2, 2026 · Updated last week
- Fancy stream processing made operationally mundane ★ 8,604 · Updated this week
- Always know what to expect from your data. ★ 11,224 · Updated this week
- Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logi… ★ 9,198 · Mar 5, 2026 · Updated last week
- CockroachDB - the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placemen… ★ 32,055 · Updated this week
- Gorgonia is a library that helps facilitate machine learning in Go. ★ 5,912 · Aug 12, 2024 · Updated last year
- Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Storm, etc. I am also wo… ★ 3,221 · Nov 2, 2018 · Updated 7 years ago
- The lightweight, fault-tolerant database built on SQLite. Designed to keep your data highly available with minimal effort. ★ 17,338 · Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ★ 16,563 · Updated this week
- The versioned, forkable, syncable database ★ 7,433 · Aug 27, 2021 · Updated 4 years ago
- Quilt is a data mesh for connecting people with actionable data ★ 1,357 · Updated this week
- lakeFS - Data version control for your data lake | Git for data ★ 5,192 · Updated this week