PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆79 Dec 18, 2025 Updated 3 months ago
Alternatives and similar repositories for jetstream-pytorch
Users who are interested in jetstream-pytorch compare it to the libraries listed below.
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆416 Jan 5, 2026 Updated 2 months ago
- Google TPU optimizations for transformers models ☆136 Jan 23, 2026 Updated last month
- Testing framework for Deep Learning models (Tensorflow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU) ☆65 Mar 11, 2026 Updated last week
- ☆19 Nov 5, 2025 Updated 4 months ago
- torchax is a PyTorch frontend for JAX. It gives JAX the ability to author JAX programs using familiar PyTorch syntax. It also provides JA… ☆196 Updated this week
- a Jax quantization library ☆100 Updated this week
- ☆16 Feb 18, 2026 Updated last month
- This repository contains example code to build models on TPUs ☆30 Feb 17, 2023 Updated 3 years ago
- Latent Large Language Models ☆19 Aug 24, 2024 Updated last year
- ☆57 Apr 23, 2024 Updated last year
- ☆19 Oct 6, 2023 Updated 2 years ago
- MLIR-based partitioning system ☆173 Mar 14, 2026 Updated last week
- ☆21 Jan 21, 2026 Updated 2 months ago
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆68 Mar 11, 2026 Updated last week
- a teaching deep learning framework: the bridge from micrograd to tinygrad ☆61 Mar 14, 2026 Updated last week
- ☆16 Apr 10, 2022 Updated 3 years ago
- A simple, performant and scalable Jax LLM! ☆2,170 Updated this week
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆22 Jul 4, 2025 Updated 8 months ago
- torchprime is a reference model implementation for PyTorch on TPU. ☆46 Mar 3, 2026 Updated 2 weeks ago
- Fast and easy distributed model training examples. ☆12 Nov 26, 2024 Updated last year
- PyTorch distributed training acceleration framework ☆54 Aug 13, 2025 Updated 7 months ago
- Example of applying CUDA graphs to LLaMA-v2 ☆12 Aug 25, 2023 Updated 2 years ago
- ☆28 Jun 3, 2024 Updated last year
- JAX backend for SGL ☆250 Updated this week
- JAX Implementations of Descript Audio Codec and EnCodec ☆34 Mar 30, 2025 Updated 11 months ago
- Implementation of Direct Preference Optimization ☆17 Jul 17, 2023 Updated 2 years ago
- ☆79 Updated this week
- Minimal yet performant LLM examples in pure JAX ☆245 Jan 14, 2026 Updated 2 months ago
- Julia package for Probabilistic Canonical Correlation Analysis ☆12 Mar 30, 2022 Updated 3 years ago
- ☆27 Oct 26, 2024 Updated last year
- ☆27 Updated this week
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆698 Jan 26, 2026 Updated last month
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat… ☆171 Mar 13, 2026 Updated last week
- ☆14 Nov 28, 2022 Updated 3 years ago
- Getting confidences from any end-to-end systems ☆11 May 24, 2023 Updated 2 years ago
- ☆150 Feb 26, 2026 Updated 3 weeks ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆31 Jun 5, 2025 Updated 9 months ago
- See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md ☆25 Dec 22, 2022 Updated 3 years ago
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆271 Updated this week