fattorib / transformer_shmapLinks

Tensor Parallelism with JAX + Shard Map

☆11

Alternatives and similar repositories for transformer_shmap

Users that are interested in transformer_shmap are comparing it to the libraries listed below

Sorting:

rolandgvc / flaxvision
A selection of neural network models ported from torchvision for JAX & Flax.
☆44Updated 4 years ago
google-deepmind / jmp
JMP is a Mixed Precision library for JAX.
☆206Updated 5 months ago
irhum / esmjax
ESM2 protein language models in JAX/Flax
☆17Updated 2 years ago
arogozhnikov / adamw_bfloat16
AdamW optimizer for bfloat16 models in pytorch 🔥.
☆33Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆140Updated last year
AakashKumarNain / mistral_jax
This is a port of Mistral-7B model in JAX
☆32Updated last year
graphcore-research / out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆46Updated last year
google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…
☆71Updated 2 weeks ago
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆56Updated 2 weeks ago
srush / Tensor-Puzzles-Penzai
☆20Updated last year
jax-ml / jax-llm-examples
☆132Updated 2 weeks ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆84Updated last year
davisyoshida / qax
If it quacks like a tensor...
☆58Updated 8 months ago
srush / triton-autodiff
Experiment of using Tangent to autodiff triton
☆79Updated last year
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18Updated last year
phlippe / jax_trainer
Lightning-like training API for JAX with Flax
☆42Updated 7 months ago
graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆125Updated this week
yixiaoer / mistral-v0.2-jax
JAX implementation of the Mistral 7b v0.2 model
☆35Updated last year
davisyoshida / haiku-mup
A port of muP to JAX/Haiku
☆25Updated 2 years ago
yixiaoer / einshard
Einsum-like high-level array sharding API for JAX
☆35Updated last year
sholtodouglas / scalingExperiments
☆61Updated 3 years ago
dlwh / jax_sourceror
Turn jitted jax functions back into python source code
☆22Updated 7 months ago
nestordemeure / jochastic
A JAX implementation of stochastic addition.
☆14Updated 2 years ago
ahennequ / pytorch-custom-mma
☆29Updated 2 years ago
joschu / jax-exp
☆24Updated 6 years ago
young-geng / mintext
Minimal but scalable implementation of large language models in JAX
☆35Updated 2 weeks ago
lucidrains / triangle-multiplicative-module
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, …
☆35Updated 3 years ago
n2cholas / jax-resnet
Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax).
☆112Updated 3 years ago
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆179Updated last month
ludwigwinkler / JaxLightning
Running Jax in PyTorch Lightning
☆106Updated 7 months ago