xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
☆180 · Apr 21, 2026 · Updated last week
Alternatives and similar repositories for xpk
Users that are interested in xpk are comparing it to the libraries listed below.
- ☆92 · Apr 22, 2026 · Updated last week
- Package of Pathways-on-Cloud utilities ☆27 · Updated this week
- KJob: Tool for CLI-loving ML researchers ☆42 · Mar 31, 2026 · Updated last month
- ☆35 · Updated this week
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆131 · Updated this week
- A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe… ☆61 · Updated this week
- ☆341 · Apr 23, 2026 · Updated last week
- A simple, performant and scalable Jax LLM! ☆2,255 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆431 · Jan 5, 2026 · Updated 3 months ago
- ☆151 · Apr 23, 2026 · Updated last week
- ☆576 · Jul 11, 2024 · Updated last year
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta… ☆549 · Apr 23, 2026 · Updated last week
- AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kub… ☆327 · Jun 23, 2025 · Updated 10 months ago
- Open-Sora: Democratizing Efficient Video Production for All ☆19 · Nov 7, 2024 · Updated last year
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆323 · Updated this week
- ☆196 · Apr 19, 2026 · Updated last week
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine ☆251 · Apr 22, 2026 · Updated last week
- Orbax provides common checkpointing and persistence utilities for JAX users ☆504 · Updated this week
- Google TPU optimizations for transformers models ☆137 · Jan 23, 2026 · Updated 3 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆447 · Apr 23, 2026 · Updated last week
- JAX-Toolbox ☆404 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆82 · Dec 18, 2025 · Updated 4 months ago
- Library for reading and processing ML training data. ☆717 · Updated this week
- ☆21 · Apr 21, 2026 · Updated last week
- ☆16 · May 11, 2025 · Updated 11 months ago
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆593 · Apr 22, 2026 · Updated last week
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments… ☆335 · Updated this week
- An Extensible Deep Learning Library ☆2,349 · Apr 16, 2026 · Updated last week
- Notebooks for managing NeurIPS 2014 and analysing the NeurIPS experiment. ☆13 · May 22, 2024 · Updated last year
- JAX Synergistic Memory Inspector ☆187 · Jul 16, 2024 · Updated last year
- AppWrapper controller for Kueue ☆17 · Apr 11, 2026 · Updated 2 weeks ago
- Holistic job manager on Kubernetes ☆116 · Feb 20, 2024 · Updated 2 years ago
- A JAX-native High Performance Eval Metrics Library ☆57 · Apr 13, 2026 · Updated 2 weeks ago
- ☆49 · Jan 5, 2026 · Updated 3 months ago
- Tensor Parallelism with JAX + Shard Map ☆11 · Sep 29, 2023 · Updated 2 years ago
- Minimal yet performant LLM examples in pure JAX ☆250 · Apr 10, 2026 · Updated 2 weeks ago
- This repository is a collection of accelerated platform best practices, reference architectures, example use cases, reference implementat… ☆93 · Updated this week
- ☆24 · Jun 24, 2025 · Updated 10 months ago
- torchprime is a reference model implementation for PyTorch on TPU. ☆47 · Mar 3, 2026 · Updated last month