young-geng / tpu_pod_commander
TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.
☆20 · Updated last year
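tpu_pod_commander's own API is not documented on this page, but the core task of any TPU pod launcher is broadcasting the same command to every worker VM in a multi-host pod slice. A minimal sketch of that idea, assuming the standard `gcloud compute tpus tpu-vm ssh --worker=all` mechanism (the `build_broadcast_cmd` helper and its parameters are illustrative, not part of tpu_pod_commander):

```python
def build_broadcast_cmd(tpu_name: str, zone: str, command: str) -> list[str]:
    # gcloud's tpu-vm ssh accepts --worker=all, which runs the given
    # --command on every host of a multi-host TPU pod slice in parallel.
    return [
        "gcloud", "compute", "tpus", "tpu-vm", "ssh", tpu_name,
        f"--zone={zone}",
        "--worker=all",
        f"--command={command}",
    ]

# Hypothetical usage: launch a training script on all workers of a pod.
argv = build_broadcast_cmd("my-pod", "us-central2-b", "python train.py")
```

A launcher package layers conveniences on top of this primitive (environment setup, log collection, restarting hung workers); the command list above is only the underlying transport.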
Alternatives and similar repositories for tpu_pod_commander
Users interested in tpu_pod_commander are comparing it to the libraries listed below.
- Minimal but scalable implementation of large language models in JAX ☆35 · Updated 2 weeks ago
- If it quacks like a tensor... ☆58 · Updated 8 months ago
- A simple library for scaling up JAX programs ☆140 · Updated 9 months ago
- LoRA for arbitrary JAX models and functions ☆140 · Updated last year
- A simple, performant and scalable JAX-based world modeling codebase ☆58 · Updated this week
- ☆31 · Updated 8 months ago
- ☆115 · Updated 2 months ago
- Train very large language models in JAX. ☆206 · Updated last year
- General Modules for JAX ☆67 · Updated 4 months ago
- Platform to run interactive Reinforcement Learning agents in a Minecraft Server ☆53 · Updated last year
- ☆19 · Updated 2 years ago
- Implementation of the PSGD optimizer in JAX ☆34 · Updated 7 months ago
- Machine Learning eXperiment Utilities ☆46 · Updated last week
- Accelerated replay buffers in JAX ☆43 · Updated 2 years ago
- A collection of meta-learning algorithms in JAX ☆23 · Updated 2 years ago
- Building blocks for productive research ☆59 · Updated last week
- Flexible meta-learning in JAX ☆14 · Updated last year
- ☆13 · Updated last year
- JAX Synergistic Memory Inspector ☆177 · Updated last year
- ☆81 · Updated 9 months ago
- Maximal Update Parametrization (μP) with Flax & Optax ☆16 · Updated last year
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆114 · Updated 11 months ago
- seqax = sequence modeling + JAX ☆165 · Updated 2 weeks ago
- ☆61 · Updated 3 years ago
- Code for Powderworld: A Platform for Understanding Generalization via Rich Task Distributions ☆68 · Updated 11 months ago
- A set of Python scripts that makes your experience on TPU better ☆54 · Updated last year
- Tools and Utils for Experiments (TUX) ☆16 · Updated 6 months ago
- Scaling scaling laws with board games ☆52 · Updated 2 years ago
- ☆51 · Updated last year
- Minimal (400 LOC) implementation of maximal (multi-node, FSDP) GPT training ☆130 · Updated last year