graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆125 Updated 5 months ago
Alternatives and similar repositories for unit-scaling:
Users who are interested in unit-scaling compare it to the libraries listed below.
- This repository contains the experimental PyTorch native float8 training UX ☆224 Updated 9 months ago
- Experiment of using Tangent to autodiff triton ☆78 Updated last year
- ☆103 Updated 11 months ago
- supporting pytorch FSDP for optimizers ☆80 Updated 4 months ago
- ☆78 Updated 10 months ago
- A simple library for scaling up JAX programs ☆134 Updated 6 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆45 Updated 9 months ago
- Accelerated First Order Parallel Associative Scan ☆182 Updated 8 months ago
- ☆157 Updated last year
- seqax = sequence modeling + JAX ☆155 Updated 3 weeks ago
- Triton-based implementation of Sparse Mixture of Experts. ☆212 Updated 5 months ago
- JAX bindings for Flash Attention v2 ☆89 Updated 9 months ago
- ☆143 Updated last year
- LoRA for arbitrary JAX models and functions ☆136 Updated last year
- ☆297 Updated this week
- extensible collectives library in triton ☆85 Updated last month
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆120 Updated this week
- Understand and test language model architectures on synthetic tasks. ☆194 Updated last month
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆92 Updated 9 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆232 Updated 2 months ago
- ☆224 Updated 2 months ago
- ☆104 Updated 8 months ago
- Applied AI experiments and examples for PyTorch ☆262 Updated last week
- ☆217 Updated 9 months ago
- ☆52 Updated 7 months ago
- JMP is a Mixed Precision library for JAX. ☆196 Updated 3 months ago
- Efficient optimizers ☆190 Updated this week
- ☆177 Updated 5 months ago
- The simplest but fast implementation of matrix multiplication in CUDA. ☆34 Updated 9 months ago
- ☆81 Updated last year