AI-Hypercomputer / torchprime
torchprime is a reference model implementation for PyTorch on TPU.
⭐44 · Updated last month
Alternatives and similar repositories for torchprime
Users interested in torchprime are comparing it to the libraries listed below.
- Load compute kernels from the Hub · ⭐389 · Updated last week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… · ⭐279 · Updated 2 months ago
- ⭐124 · Updated last year
- Two implementations of ZeRO-1 optimizer sharding in JAX · ⭐14 · Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. · ⭐186 · Updated 2 weeks ago
- A library for unit scaling in PyTorch · ⭐133 · Updated 6 months ago
- ⭐345 · Updated last week
- This repository contains the experimental PyTorch native float8 training UX · ⭐227 · Updated last year
- Scalable and Performant Data Loading · ⭐364 · Updated this week
- ⭐92 · Updated last year
- ring-attention experiments · ⭐165 · Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" · ⭐247 · Updated 8 months ago
- ⭐16 · Updated 8 months ago
- ⭐192 · Updated this week
- ⭐562 · Updated last year
- TPU inference for vLLM, with unified JAX and PyTorch support. · ⭐228 · Updated this week
- ⭐147 · Updated this week
- JAX bindings for Flash Attention v2 · ⭐103 · Updated last week
- Accelerate and optimize performance with streamlined training and serving options in JAX. · ⭐334 · Updated last month
- ⭐289 · Updated last year
- Accelerated First Order Parallel Associative Scan · ⭐196 · Updated last month
- MoE training for Me and You and maybe other people · ⭐335 · Updated last month
- Google TPU optimizations for transformers models · ⭐135 · Updated 2 weeks ago
- Triton-based implementation of Sparse Mixture of Experts. · ⭐263 · Updated 4 months ago
- ⭐232 · Updated 2 months ago
- Dion optimizer algorithm · ⭐424 · Updated 3 weeks ago
- A set of Python scripts that makes your experience on TPU better · ⭐56 · Updated 4 months ago
- ⭐150 · Updated 2 years ago
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference. · ⭐334 · Updated 3 months ago
- seqax = sequence modeling + JAX · ⭐170 · Updated 6 months ago