AakashKumarNain / mistral_jax

This is a port of the Mistral-7B model to JAX.

☆32

Alternatives and similar repositories for mistral_jax

Users who are interested in mistral_jax are comparing it to the libraries listed below.

cgarciae / ciclo
A functional training loops library for JAX
☆88 Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆143 Updated last year
phlippe / jax_trainer
Lightning-like training API for JAX with Flax
☆44 Updated 11 months ago
cgarciae / nnx
Neural Networks for JAX
☆84 Updated last year
google-deepmind / tf2jax
☆118 Updated 3 weeks ago
ml-gde / jflux
JAX Implementation of Black Forest Labs' Flux.1 family of models
☆39 Updated 2 weeks ago
young-geng / scalax
A simple library for scaling up JAX programs
☆144 Updated last month
lucidrains / flash-attention-jax
Implementation of Flash Attention in Jax
☆222 Updated last year
ludwigwinkler / JaxLightning
Running Jax in PyTorch Lightning
☆114 Updated 11 months ago
JesseFarebro / flax-mup
Maximal Update Parametrization (μP) with Flax & Optax.
☆16 Updated last year
google-deepmind / jmp
JMP is a Mixed Precision library for JAX.
☆211 Updated 10 months ago
srush / triton-autodiff
Experiment of using Tangent to autodiff triton
☆80 Updated last year
Artur-Galstyan / statedict2pytree
☆44 Updated last month
crowsonkb / jax-wavelets
The 2D discrete wavelet transform for JAX
☆44 Updated 2 years ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆92 Updated last year
samuela / torch2jax
Run PyTorch in JAX. 🤝
☆307 Updated last month
lucidrains / frame-averaging-pytorch
Pytorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network
☆51 Updated last year
kvfrans / jax-flow
Flow-matching algorithms in JAX
☆111 Updated last year
DarshanDeshpande / jax-models
Unofficial JAX implementations of deep learning research papers
☆159 Updated 3 years ago
sholtodouglas / scalingExperiments
☆62 Updated 3 years ago
yixiaoer / einshard
Einsum-like high-level array sharding API for JAX
☆34 Updated last year
kvfrans / splus
☆119 Updated 5 months ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 10 months ago
BirkhoffG / jax-dataloader
Pytorch-like dataloaders for JAX.
☆97 Updated 6 months ago
openxla / tokamax
Tokamax: A GPU and TPU kernel library.
☆122 Updated this week
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆188 Updated last month
HenryNdubuaku / nanodl
A Jax-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more.
☆297 Updated last year
cloneofsimo / min-fsdp
☆91 Updated last year
apple / ml-ademamix
☆68 Updated last year
cgarciae / einop
☆60 Updated 3 years ago