Data-Centric Pipelines and Data Versioning
★6,293 · Feb 3, 2025 · Updated last year
Alternatives and similar repositories for pachyderm
Users who are interested in pachyderm are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes ★15,654 · May 24, 2026 · Updated last week
- Data Versioning and ML Experiments ★15,638 · Updated this week
- Open Source AI Infra & Engineering Control Plane ★3,706 · Apr 26, 2026 · Updated last month
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ★4,752 · Mar 23, 2026 · Updated 2 months ago
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ★18,723 · May 19, 2026 · Updated last week
- Build, Manage and Deploy AI/ML Systems ★10,110 · Updated this week
- Workflow Engine for Kubernetes ★16,727 · Updated this week
- High-Performance Serverless event and data processing platform ★5,724 · Updated this week
- Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows. ★7,046 · May 24, 2026 · Updated last week
- The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, a… ★26,210 · Updated this week
- Production infrastructure for machine learning at scale ★8,017 · Jun 12, 2024 · Updated last year
- high-performance graph database for real-time use cases ★21,672 · Updated this week
- Parallel computing with task scheduling ★13,845 · May 22, 2026 · Updated last week
- Parameterize, execute, and analyze notebooks ★6,447 · May 12, 2026 · Updated 2 weeks ago
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ★22,500 · Updated this week
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ★10,868 · May 22, 2026 · Updated last week
- The Open Source Feature Store for AI/ML ★7,052 · May 23, 2026 · Updated last week
- Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or … ★3,561 · May 8, 2026 · Updated 3 weeks ago
- An orchestration platform for the development, production, and observation of data assets. ★15,565 · May 25, 2026 · Updated last week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★42,709 · Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ★45,588 · Updated this week
- An open-source graph database ★15,043 · May 5, 2026 · Updated 3 weeks ago
- Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native) ★5,009 · May 20, 2026 · Updated last week
- A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Doc… ★2,526 · Feb 21, 2024 · Updated 2 years ago
- Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Storm, etc. I am also wo… ★3,219 · Nov 2, 2018 · Updated 7 years ago
- Easy and Repeatable Kubernetes Development ★15,827 · May 21, 2026 · Updated last week
- Always know what to expect from your data. ★11,525 · May 21, 2026 · Updated last week
- CockroachDB - the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placemen… ★32,163 · Updated this week
- Gorgonia is a library that helps facilitate machine learning in Go. ★5,916 · Aug 12, 2024 · Updated last year
- OpenFaaS - Serverless Functions Made Simple ★26,158 · Apr 1, 2026 · Updated 2 months ago
- Apache Superset is a Data Visualization and Data Exploration Platform ★72,971 · May 24, 2026 · Updated last week
- Open Source ML Model Versioning, Metadata, and Experiment Management ★1,747 · Jul 23, 2024 · Updated last year
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ★28,613 · May 6, 2026 · Updated 3 weeks ago
- Quilt is a Scientific Data Management Platform on AWS that helps teams and AI find, trust, and reuse data through deeply versioned, conte… ★1,363 · May 25, 2026 · Updated last week
- The versioned, forkable, syncable database ★7,424 · Aug 27, 2021 · Updated 4 years ago
- Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and mo… ★8,370 · May 4, 2026 · Updated 3 weeks ago
- A next-generation curated knowledge sharing platform for data scientists and other technical professions. ★5,533 · Sep 4, 2024 · Updated last year
- Fancy stream processing made operationally mundane ★8,676 · Updated this week
- The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more! ★8,657 · May 7, 2026 · Updated 3 weeks ago