AI-Hypercomputer / tpu-recipes
☆27 · Updated last week
Alternatives and similar repositories for tpu-recipes
Users interested in tpu-recipes are comparing it to the libraries listed below.
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆63 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆60 · Updated last month
- ☆138 · Updated 2 weeks ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat… ☆120 · Updated last week
- ☆20 · Updated last year
- Google TPU optimizations for transformers models ☆109 · Updated 3 months ago
- Train, tune, and infer Bamba model ☆124 · Updated 2 weeks ago
- JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆325 · Updated this week
- Load compute kernels from the Hub ☆119 · Updated last week
- ☆186 · Updated 2 weeks ago
- seqax = sequence modeling + JAX ☆155 · Updated last month
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆93 · Updated last week
- ☆109 · Updated this week
- ☆43 · Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆60 · Updated 6 months ago
- Testing framework for Deep Learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU) ☆64 · Updated 5 months ago
- train with kittens! ☆57 · Updated 6 months ago
- Checkpointable dataset utilities for foundation model training ☆32 · Updated last year
- ☆106 · Updated 11 months ago
- Docker image for NVIDIA GH200 machines, optimized for vLLM serving and HF Trainer finetuning ☆40 · Updated 2 months ago
- Extensible collectives library in Triton ☆86 · Updated last month
- Some common Hugging Face Transformers models in maximal update parametrization (µP) ☆80 · Updated 3 years ago
- ☆21 · Updated 2 months ago
- ☆67 · Updated 2 years ago
- Dolomite Engine is a library for pretraining/finetuning LLMs ☆53 · Updated this week
- Various transformers for FSDP research ☆37 · Updated 2 years ago
- Experiment of using Tangent to autodiff Triton ☆78 · Updated last year
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … ☆59 · Updated 7 months ago
- A set of Python scripts that makes your experience on TPU better ☆53 · Updated 10 months ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ☆44 · Updated 11 months ago