AI-Hypercomputer / cloud-accelerator-diagnostics
☆21Updated last week
Alternatives and similar repositories for cloud-accelerator-diagnostics
Users who are interested in cloud-accelerator-diagnostics are comparing it to the libraries listed below.
- A simple library for scaling up JAX programs☆139Updated 7 months ago
- ☆141Updated 3 weeks ago
- ☆186Updated 3 weeks ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat…☆125Updated this week
- JAX Synergistic Memory Inspector☆174Updated 11 months ago
- ☆126Updated last month
- ☆318Updated this week
- JAX-Toolbox☆311Updated last week
- jax-triton contains integrations between JAX and OpenAI Triton☆400Updated 3 weeks ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…☆510Updated last week
- Named Tensors for Legible Deep Learning in JAX☆181Updated this week
- Two implementations of ZeRO-1 optimizer sharding in JAX☆14Updated 2 years ago
- seqax = sequence modeling + JAX☆159Updated last week
- Implementation of Flash Attention in Jax☆213Updated last year
- Orbax provides common checkpointing and persistence utilities for JAX users☆393Updated this week
- PyTorch centric eager mode debugger☆47Updated 6 months ago
- PyTorch Single Controller☆231Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆62Updated 2 months ago
- Experiment of using Tangent to autodiff triton☆79Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT☆57Updated 2 years ago
- Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox.☆24Updated 8 months ago
- Inference code for LLaMA models in JAX☆118Updated last year
- LoRA for arbitrary JAX models and functions☆138Updated last year
- ☆67Updated 2 years ago
- ☆270Updated 11 months ago
- JAX bindings for Flash Attention v2☆89Updated 11 months ago
- ☆87Updated last week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…☆253Updated this week
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …☆94Updated this week
- ☆14Updated last month