AI-Hypercomputer / torchprime
torchprime is a reference model implementation for PyTorch on TPU.
☆22 Updated this week
Alternatives and similar repositories for torchprime
Users who are interested in torchprime compare it to the libraries listed below.
- Google TPU optimizations for transformers models ☆112 Updated 4 months ago
- ☆138 Updated 2 weeks ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆60 Updated 2 months ago
- Load compute kernels from the Hub ☆144 Updated this week
- ☆108 Updated last year
- Various transformers for FSDP research ☆37 Updated 2 years ago
- PyTorch/XLA SPMD test code on Google TPU ☆23 Updated last year
- ☆13 Updated 2 weeks ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆105 Updated 2 months ago
- ☆20 Updated last year
- Applied AI experiments and examples for PyTorch ☆271 Updated last week
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆183 Updated last month
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 Updated this week
- Accelerate, Optimize performance with streamlined training and serving options with JAX. ☆274 Updated this week
- Two implementations of ZeRO-1 optimizer sharding in JAX ☆14 Updated last year
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆249 Updated this week
- A set of Python scripts that makes your experience on TPU better ☆54 Updated 11 months ago
- This repository contains the experimental PyTorch native float8 training UX ☆223 Updated 10 months ago
- Cataloging released Triton kernels. ☆229 Updated 4 months ago
- ☆169 Updated 5 months ago
- ☆186 Updated this week
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆94 Updated 10 months ago
- ☆215 Updated this week
- Inference code for LLaMA models in JAX ☆118 Updated last year
- Collection of kernels written in Triton language ☆125 Updated 2 months ago
- Implementation of Flash Attention in Jax ☆212 Updated last year
- ☆157 Updated last year
- ☆190 Updated 3 months ago
- A tool to configure, launch and manage your machine learning experiments. ☆153 Updated this week
- Learn CUDA with PyTorch ☆21 Updated this week