pachyderm / pachyderm
Data-Centric Pipelines and Data Versioning
☆6,274 Updated 10 months ago
Alternatives and similar repositories for pachyderm
Users who are interested in pachyderm are comparing it to the libraries listed below.
- High-Performance Serverless event and data processing platform ☆5,625 Updated last week
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,686 Updated last week
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models ☆4,688 Updated last week
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,594 Updated 6 months ago
- Quilt is a data mesh for connecting people with actionable data ☆1,354 Updated last week
- Open Source ML Model Versioning, Metadata, and Experiment Management ☆1,743 Updated last year
- Machine Learning Toolkit for Kubernetes ☆15,316 Updated last month
- 📚 Parameterize, execute, and analyze notebooks ☆6,333 Updated last week
- Beaker Extensions for Jupyter Notebook ☆2,835 Updated 2 years ago
- A next-generation curated knowledge sharing platform for data scientists and other technical professions. ☆5,536 Updated last year
- Production infrastructure for machine learning at scale ☆8,029 Updated last year
- Build, Manage and Deploy AI/ML Systems ☆9,659 Updated last week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ☆6,629 Updated this week
- Parallel computing with task scheduling ☆13,638 Updated 2 weeks ago
- The Open Source Feature Store for AI/ML ☆6,511 Updated this week
- A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Doc… ☆2,523 Updated last year
- Ready-to-run Docker images containing Jupyter applications ☆8,377 Updated last week
- An open-source graph database ☆15,004 Updated 2 weeks ago
- lakeFS - Data version control for your data lake | Git for data ☆5,023 Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,486 Updated last month
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ☆1,994 Updated 3 years ago
- A curated list of awesome ETL frameworks, libraries, and software. ☆3,494 Updated last year
- Run your code in the cloud, with technology so advanced, it feels like magic! ☆2,660 Updated last week
- 🦉 Data Versioning and ML Experiments ☆15,180 Updated this week
- Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow ☆2,750 Updated last week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆16,241 Updated this week
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s… ☆8,456 Updated last month
- A low-latency prediction-serving system ☆1,420 Updated 4 years ago
- Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f… ☆1,867 Updated last month
- NumPy and Pandas interface to Big Data ☆3,199 Updated 2 years ago