Data-Centric Pipelines and Data Versioning
★6,288 Feb 3, 2025 Updated last year
Alternatives and similar repositories for pachyderm
Users who are interested in pachyderm compare it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes ★15,482 Jan 5, 2026 Updated 2 months ago
- Data Versioning and ML Experiments ★15,404 Updated this week
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ★3,697 Feb 25, 2026 Updated last week
- Workflow Engine for Kubernetes ★16,481 Updated this week
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ★18,683 Updated this week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ★4,731 Updated this week
- Build, Manage and Deploy AI/ML Systems ★9,863 Updated this week
- High-Performance Serverless event and data processing platform ★5,680 Feb 26, 2026 Updated last week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ★6,767 Updated this week
- high-performance graph database for real-time use cases ★21,625 Feb 26, 2026 Updated last week
- Production infrastructure for machine learning at scale ★8,029 Jun 12, 2024 Updated last year
- The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, … ★24,485 Updated this week
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ★21,697 Updated this week
- An orchestration platform for the development, production, and observation of data assets. ★15,049 Updated this week
- Parallel computing with task scheduling ★13,754 Updated this week
- Parameterize, execute, and analyze notebooks ★6,390 Updated this week
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ★10,771 Feb 26, 2026 Updated last week
- The Open Source Feature Store for AI/ML ★6,756 Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ★44,430 Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★41,516 Updated this week
- Easy and Repeatable Kubernetes Development ★15,759 Updated this week
- Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or … ★3,555 Feb 20, 2026 Updated last week
- OpenFaaS - Serverless Functions Made Simple ★26,103 Feb 22, 2026 Updated last week
- An open-source graph database ★15,036 Nov 22, 2025 Updated 3 months ago
- Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native) ★5,009 Nov 13, 2023 Updated 2 years ago
- A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Doc… ★2,531 Feb 21, 2024 Updated 2 years ago
- Apache Superset is a Data Visualization and Data Exploration Platform ★70,755 Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ★28,241 Updated this week
- Fancy stream processing made operationally mundane ★8,593 Feb 25, 2026 Updated last week
- Always know what to expect from your data. ★11,197 Updated this week
- Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logi… ★9,193 Updated this week
- CockroachDB - the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placemen… ★31,960 Updated this week
- Gorgonia is a library that helps facilitate machine learning in Go. ★5,913 Aug 12, 2024 Updated last year
- Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Storm, etc. I am also wo… ★3,220 Nov 2, 2018 Updated 7 years ago
- The lightweight, fault-tolerant database built on SQLite. Designed to keep your data highly available with minimal effort. ★17,328 Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ★16,543 Updated this week
- The versioned, forkable, syncable database ★7,434 Aug 27, 2021 Updated 4 years ago
- Quilt is a data mesh for connecting people with actionable data ★1,357 Updated this week
- Storage Orchestration for Kubernetes ★13,393 Feb 26, 2026 Updated last week