PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆84 · Dec 18, 2025 · Updated 6 months ago
Alternatives and similar repositories for jetstream-pytorch
Users that are interested in jetstream-pytorch are comparing it to the libraries listed below.
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆447 · Jan 5, 2026 · Updated 5 months ago
- Google TPU optimizations for transformers models ☆136 · Jan 23, 2026 · Updated 5 months ago
- ☆21 · Apr 27, 2026 · Updated 2 months ago
- torchax is a PyTorch frontend for JAX. It gives JAX the ability to author JAX programs using familiar PyTorch syntax. It also provides JA… ☆230 · Jun 17, 2026 · Updated last week
- This repository contains example code to build models on TPUs ☆30 · Feb 17, 2023 · Updated 3 years ago
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- ☆56 · Apr 23, 2024 · Updated 2 years ago
- MLIR-based partitioning system ☆191 · Updated this week
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆78 · Jun 18, 2026 · Updated last week
- ☆16 · Apr 10, 2022 · Updated 4 years ago
- A simple, performant and scalable Jax LLM! ☆2,338 · Updated this week
- The Structure and Interpretation of Tensor Programs: The Hacker's Accelerated Introduction to Deep Learning and Deep Learning Systems ☆80 · Updated this week
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆24 · Jul 4, 2025 · Updated 11 months ago
- torchprime is a reference model implementation for PyTorch on TPU. ☆48 · Mar 3, 2026 · Updated 3 months ago
- Fast and easy distributed model training examples. ☆12 · Nov 26, 2024 · Updated last year
- PyTorch distributed training acceleration framework ☆56 · Aug 13, 2025 · Updated 10 months ago
- Example of applying CUDA graphs to LLaMA-v2 ☆11 · Aug 25, 2023 · Updated 2 years ago
- ☆34 · Jun 3, 2024 · Updated 2 years ago
- JAX Implementations of Descript Audio Codec and EnCodec ☆37 · Mar 30, 2025 · Updated last year
- Implementation of Direct Preference Optimization ☆17 · Jul 17, 2023 · Updated 2 years ago
- JAX backend for SGL ☆295 · Updated this week
- Minimal yet performant LLM examples in pure JAX ☆261 · Apr 10, 2026 · Updated 2 months ago
- ☆31 · Updated this week
- An IR for efficiently simulating distributed ML computation. ☆33 · Jan 13, 2024 · Updated 2 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆706 · Jan 26, 2026 · Updated 5 months ago
- ☆14 · Nov 28, 2022 · Updated 3 years ago
- Getting confidences from any end-to-end systems ☆11 · May 24, 2023 · Updated 3 years ago
- ☆153 · Jun 24, 2026 · Updated last week
- ☆20 · May 30, 2026 · Updated last month
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆32 · Jun 5, 2025 · Updated last year
- See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md ☆25 · Dec 22, 2022 · Updated 3 years ago
- A project for training Korean language models (Flax, PyTorch with Hugging Face Accelerate) ☆32 · Sep 13, 2023 · Updated 2 years ago
- Lattice combination algorithm to combine inaccurate transcripts with hypothesis lattices ☆16 · Mar 19, 2024 · Updated 2 years ago
- Java tool to translate VRP instances to VRP-REP unified format. ☆11 · Nov 28, 2014 · Updated 11 years ago
- ☆34 · May 14, 2025 · Updated last year
- ☆13 · Jul 16, 2024 · Updated last year
- Script for bundling Common Voice (https://commonvoice.mozilla.org/) clips by language ☆11 · Apr 13, 2023 · Updated 3 years ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆133 · Jun 8, 2026 · Updated 3 weeks ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta… ☆556 · Jun 4, 2026 · Updated 3 weeks ago