AI-Hypercomputer / cloud-accelerator-diagnostics

☆24

Alternatives and similar repositories for cloud-accelerator-diagnostics

Users who are interested in cloud-accelerator-diagnostics are comparing it to the libraries listed below.

google / paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…
☆540 Updated 3 weeks ago
google / saxml
☆148 Updated last month
google / aqt
☆337 Updated 2 weeks ago
google / praxis
☆190 Updated 2 weeks ago
jax-ml / jax-triton
jax-triton contains integrations between JAX and OpenAI Triton
☆436 Updated this week
AI-Hypercomputer / torchprime
torchprime is a reference model implementation for PyTorch on TPU.
☆41 Updated last month
jax-ml / jax-llm-examples
Minimal yet performant LLM examples in pure JAX
☆204 Updated 2 months ago
marin-community / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆685 Updated last week
AI-Hypercomputer / jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆78 Updated 2 months ago
MatX-inc / seqax
seqax = sequence modeling + JAX
☆168 Updated 4 months ago
NVIDIA / JAX-Toolbox
JAX-Toolbox
☆364 Updated this week
meta-pytorch / torchft
Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)
☆455 Updated this week
lucidrains / flash-attention-jax
Implementation of Flash Attention in Jax
☆223 Updated last year
foundation-model-stack / fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…
☆271 Updated last week
google-deepmind / nanodo
☆285 Updated last year
rwitten / HighPerfLLMs2024
☆545 Updated last year
meta-pytorch / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆226 Updated last year
AI-Hypercomputer / JetStream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…
☆392 Updated 5 months ago
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216 Updated last year
graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆132 Updated 4 months ago
openxla / tokamax
Tokamax: A GPU and TPU kernel library.
☆122 Updated this week
young-geng / scalax
A simple library for scaling up JAX programs
☆144 Updated last month
ayaka14732 / jax-smi
JAX Synergistic Memory Inspector
☆183 Updated last year
mgmalek / efficient_cross_entropy
☆121 Updated last year
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆277 Updated 3 years ago
google / grain
Library for reading and processing ML training data.
☆611 Updated this week
vllm-project / tpu-inference
TPU inference for vLLM, with unified JAX and PyTorch support.
☆178 Updated this week
AI-Hypercomputer / maxdiffusion
☆283 Updated last week
proger / accelerated-scan
Accelerated First Order Parallel Associative Scan
☆192 Updated last year
pytorch / helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆658 Updated this week