fattorib / transformer_shmap

Tensor Parallelism with JAX + Shard Map
10 · Updated 11 months ago

Related projects: