NVIDIA / ais-etl
Deploys custom ETL containers on AIStore so that user-defined extract-transform-load jobs run in parallel, on the fly and/or offline, local to the user's data.
☆19Updated 2 months ago
Alternatives and similar repositories for ais-etl
Users who are interested in ais-etl are comparing it to the libraries listed below.
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes.☆121Updated this week
- Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.☆520Updated 2 years ago
- A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel app…☆357Updated this week
- Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.☆50Updated 3 years ago
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…☆267Updated last week
- Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.☆304Updated last week
- Morpheus Runtime Core (MRC)☆51Updated last week
- Ray-based Apache Beam runner☆42Updated 2 years ago
- ☆60Updated this week
- Utilities for Dask and CUDA interactions☆319Updated this week
- RAPIDS GPU-BDB☆108Updated last year
- MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible.☆158Updated 2 months ago
- FlorDB 🌻☆158Updated 3 months ago
- KvikIO - High Performance File IO☆238Updated this week
- Curated examples and patterns for using Chalk. Use these to build your feature pipelines.☆26Updated last month
- Chassis turns machine learning models into portable container images that can run just about anywhere.☆86Updated last year
- A top-like tool for monitoring GPUs in a cluster☆84Updated last year
- Unified specification for defining and executing ML workflows, making reproducibility, consistency, and governance easier across the ML p…☆94Updated last year
- This repository contains example integrations between Determined and other ML products☆48Updated last year
- Unified storage framework for the entire machine learning lifecycle☆155Updated last year
- Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.☆234Updated 2 years ago
- ☆283Updated 10 months ago
- Cluster Toolkit is open-source software offered by Google Cloud that makes it easy for customers to deploy AI/ML and HPC environments…☆308Updated this week
- ☆30Updated 2 years ago
- Flyte Documentation 📖☆86Updated 9 months ago
- Distributed persistent Task Queue running on Dask☆38Updated 2 years ago
- 🪴 Nebari - your open source data science platform☆319Updated this week
- UnionML: the easiest way to build and deploy machine learning microservices☆336Updated 2 years ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.☆301Updated last year
- Python bindings for UCX☆140Updated 4 months ago