AI-Hypercomputer / JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future; PRs welcome).
☆354Updated last month
Alternatives and similar repositories for JetStream
Users who are interested in JetStream are comparing it to the libraries listed below.
- ☆142Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆64Updated 3 months ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…☆513Updated this week
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat…☆129Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)☆359Updated last week
- Module, Model, and Tensor Serialization/Deserialization☆248Updated this week
- ☆214Updated 5 months ago
- ☆320Updated 2 weeks ago
- ☆511Updated last year
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …☆187Updated this week
- PyTorch Single Controller☆318Updated this week
- ☆230Updated this week
- ☆310Updated 10 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.☆205Updated this week
- ☆228Updated this week
- Google TPU optimizations for transformers models☆114Updated 5 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆264Updated 9 months ago
- A library to analyze PyTorch traces.☆391Updated this week
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…☆369Updated last week
- A tool to configure, launch and manage your machine learning experiments.☆169Updated this week
- Applied AI experiments and examples for PyTorch☆281Updated last month
- Perplexity GPU Kernels☆395Updated last month
- ☆186Updated last month
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…☆255Updated this week
- Fast low-bit matmul kernels in Triton☆327Updated this week
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems☆468Updated this week
- Home for OctoML PyTorch Profiler☆113Updated 2 years ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆190Updated this week
- CUDA checkpoint and restore utility☆345Updated 5 months ago
- ☆56Updated 9 months ago