AI-Hypercomputer / ml-goodput-measurement

☆19

Alternatives and similar repositories for ml-goodput-measurement

Users who are interested in ml-goodput-measurement are comparing it to the libraries listed below.

AI-Hypercomputer / xpk
xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat…
☆136 Updated this week
google / saxml
☆142 Updated last week
AI-Hypercomputer / gpu-recipes
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
☆78 Updated this week
GoogleCloudPlatform / ml-auto-solutions
A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe…
☆50 Updated this week
AI-Hypercomputer / jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆67 Updated 4 months ago
jax-ml / jax-tpu-embedding
☆19 Updated this week
AI-Hypercomputer / tpu-recipes
☆39 Updated 3 weeks ago
sholtodouglas / multihost_dataloading
Experimenting with how best to do multi-host dataloading
☆10 Updated 2 years ago
AI-Hypercomputer / kithara
☆14 Updated 2 months ago
fattorib / ZeRO-transformer
Two implementations of ZeRO-1 optimizer sharding in JAX
☆14 Updated 2 years ago
AI-Hypercomputer / maxdiffusion
☆238 Updated last week
google / praxis
☆187 Updated last week
AI-Hypercomputer / JetStream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…
☆365 Updated last month
AI-Hypercomputer / torchprime
torchprime is a reference model implementation for PyTorch on TPU.
☆33 Updated last week
google / vertex-ai-nas
With Vertex AI NAS, you can search for optimal neural architectures in terms of accuracy, latency, memory, a combination of these, or a c…
☆29 Updated last year
thecharlieblake / lovely-llama
An implementation of the Llama architecture, to instruct and delight
☆21 Updated 2 months ago
GoogleCloudPlatform / ml-testing-accelerators
Testing framework for deep learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)
☆64 Updated last month
google / jaxonnxruntime
A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend.
☆118 Updated last week
google / init2winit
☆77 Updated this week
coreweave / ml-containers
☆38 Updated this week
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆117 Updated 6 months ago
google-research / kauldron
Modular, scalable library to train ML models
☆143 Updated this week
google / tunix
A JAX-native LLM Post-Training Library
☆84 Updated this week
GoogleCloudPlatform / slurm-gcp
☆51 Updated 3 weeks ago
foundation-model-stack / fms-acceleration
🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
☆11 Updated last month
google-deepmind / tf2jax
☆115 Updated last week
facebookresearch / spdl
Scalable and Performant Data Loading
☆291 Updated this week
google / struct2tensor
struct2tensor is a library for parsing and manipulating structured data inside of TensorFlow.
☆34 Updated 4 months ago
lianakoleva / no-libtorch-compile
☆21 Updated 5 months ago
pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …
☆96 Updated this week