PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆82Dec 18, 2025Updated 4 months ago
Alternatives and similar repositories for jetstream-pytorch
Users that are interested in jetstream-pytorch are comparing it to the libraries listed below.
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆431Jan 5, 2026Updated 3 months ago
- Google TPU optimizations for transformers models☆138Jan 23, 2026Updated 3 months ago
- Testing framework for Deep Learning models (Tensorflow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)☆65Mar 11, 2026Updated last month
- ☆20Nov 5, 2025Updated 5 months ago
- ☆18Feb 18, 2026Updated 2 months ago
- This repository contains example code to build models on TPUs☆30Feb 17, 2023Updated 3 years ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- ☆57Apr 23, 2024Updated 2 years ago
- MLIR-based partitioning system☆180Updated this week
- ☆22Apr 17, 2026Updated last week
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training☆72Apr 8, 2026Updated 3 weeks ago
- ☆16Apr 10, 2022Updated 4 years ago
- A simple, performant and scalable Jax LLM!☆2,255Updated this week
- A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe…☆61Updated this week
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems☆23Jul 4, 2025Updated 9 months ago
- llm201n: neural networks zero to super hero. the bridge from micrograd to tinygrad!☆71Apr 22, 2026Updated last week
- torchprime is a reference model implementation for PyTorch on TPU.☆47Mar 3, 2026Updated last month
- Fast and easy distributed model training examples.☆12Nov 26, 2024Updated last year
- PyTorch distributed training acceleration framework☆55Aug 13, 2025Updated 8 months ago
- Android ORM framework.☆20Jul 1, 2015Updated 10 years ago
- Example of applying CUDA graphs to LLaMA-v2☆11Aug 25, 2023Updated 2 years ago
- ☆28Jun 3, 2024Updated last year
- JAX Implementations of Descript Audio Codec and EnCodec☆35Mar 30, 2025Updated last year
- Implementation of Direct Preference Optimization☆17Jul 17, 2023Updated 2 years ago
- JAX backend for SGL☆267Updated this week
- Minimal yet performant LLM examples in pure JAX☆251Apr 10, 2026Updated 3 weeks ago
- ☆94Updated this week
- ☆29Apr 21, 2026Updated last week
- An IR for efficiently simulating distributed ML computation.☆33Jan 13, 2024Updated 2 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆704Jan 26, 2026Updated 3 months ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat…☆180Apr 21, 2026Updated last week
- ☆14Nov 28, 2022Updated 3 years ago
- FP4 MAC Array☆19Apr 14, 2024Updated 2 years ago
- ☆151Apr 23, 2026Updated last week
- ☆20Updated this week
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆32Jun 5, 2025Updated 10 months ago
- See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md☆25Dec 22, 2022Updated 3 years ago
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …☆287Updated this week
- Lattice combination algorithm to combine inaccurate transcripts with hypothesis lattices☆16Mar 19, 2024Updated 2 years ago