xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
☆183May 14, 2026Updated 3 weeks ago
Alternatives and similar repositories for xpk
Users that are interested in xpk are comparing it to the libraries listed below.
- ☆104Updated this week
- Package of Pathways-on-Cloud utilities☆30Jun 1, 2026Updated last week
- KJob: Tool for CLI-loving ML researchers☆43Jun 1, 2026Updated last week
- ☆36Updated this week
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.☆133Updated this week
- ☆15Apr 8, 2026Updated 2 months ago
- ☆356Updated this week
- A simple, performant and scalable Jax LLM!☆2,304Jun 2, 2026Updated last week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆443Jan 5, 2026Updated 5 months ago
- a Jax quantization library☆119Updated this week
- ☆153May 29, 2026Updated last week
- ☆582Jul 11, 2024Updated last year
- PathwaysJob API is an OSS Kubernetes-native API, to deploy ML training and batch inference workloads, using Pathways on GKE.☆21Oct 22, 2025Updated 7 months ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…☆554Updated this week
- AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kub…☆328Jun 23, 2025Updated 11 months ago
- ☆355Apr 13, 2026Updated last month
- JobSet: a k8s native API for distributed ML training and HPC workloads☆326Updated this week
- ☆197May 4, 2026Updated last month
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine☆254Jun 2, 2026Updated last week
- Orbax provides common checkpointing and persistence utilities for JAX users☆518Updated this week
- Google TPU optimizations for transformers models☆137Jan 23, 2026Updated 4 months ago
- ☆31Jun 2, 2026Updated last week
- jax-triton contains integrations between JAX and OpenAI Triton☆461Jun 1, 2026Updated last week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆82Dec 18, 2025Updated 5 months ago
- Library for reading and processing ML training data.☆740Updated this week
- Kubernetes-native Job Queueing☆2,539Updated this week
- Two implementations of ZeRO-1 optimizer sharding in JAX☆14Jun 11, 2023Updated 2 years ago
- ☆21May 9, 2026Updated last month
- ☆15May 11, 2025Updated last year
- Task-based datasets, preprocessing, and evaluation for sequence models.☆594May 12, 2026Updated 3 weeks ago
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments…☆343Jun 1, 2026Updated last week
- Notebooks for managing NeurIPS 2014 and analysing the NeurIPS experiment.☆13May 22, 2024Updated 2 years ago
- An Extensible Deep Learning Library☆2,363May 16, 2026Updated 3 weeks ago
- AppWrapper controller for Kueue☆17May 22, 2026Updated 2 weeks ago
- Tensor Parallelism with JAX + Shard Map☆11Sep 29, 2023Updated 2 years ago
- 🎹 Instruct.KR 2025 Summer Meetup: Open-Source LLMs, All the Way to Production with vLLM 🎹☆23Aug 2, 2025Updated 10 months ago
- This repository is a collection of accelerated platform best practices, reference architectures, example use cases, reference implementat…☆100Updated this week
- torchprime is a reference model implementation for PyTorch on TPU.☆47Mar 3, 2026Updated 3 months ago
- ☆66Updated this week