AI-Hypercomputer / tpu-recipes
☆34 Updated this week
Alternatives and similar repositories for tpu-recipes
Users interested in tpu-recipes are comparing it to the libraries listed below.
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆62 Updated 2 months ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerat… ☆125 Updated this week
- ☆141 Updated 3 weeks ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆73 Updated last week
- Google TPU optimizations for transformers models ☆113 Updated 5 months ago
- A set of Python scripts that makes your experience on TPU better ☆55 Updated 11 months ago
- ☆21 Updated 3 months ago
- ☆126 Updated last month
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆46 Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆349 Updated 2 weeks ago
- train with kittens! ☆60 Updated 8 months ago
- Experiment of using Tangent to autodiff triton ☆79 Updated last year
- ☆67 Updated 2 years ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors` ☆45 Updated last year
- Make triton easier ☆46 Updated last year
- ☆109 Updated last year
- A place to store reusable transformer components of my own creation or found on the interwebs ☆56 Updated last week
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆114 Updated this week
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆84 Updated last year
- Load compute kernels from the Hub ☆191 Updated this week
- PyTorch centric eager mode debugger ☆47 Updated 6 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆119 Updated this week
- PyTorch per step fault tolerance (actively under development) ☆329 Updated this week
- ☆14 Updated last month
- Minimal but scalable implementation of large language models in JAX ☆35 Updated 7 months ago
- Two implementations of ZeRO-1 optimizer sharding in JAX ☆14 Updated 2 years ago
- Machine Learning eXperiment Utilities ☆46 Updated last year
- ☆21 Updated last week
- PyTorch Single Controller ☆231 Updated this week
- seqax = sequence modeling + JAX ☆159 Updated last week