young-geng / mlxuLinks

Machine Learning eXperiment Utilities

☆46

Alternatives and similar repositories for mlxu

Users that are interested in mlxu are comparing it to the libraries listed below

Sorting:

davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆143Updated last year
young-geng / scalax
A simple library for scaling up JAX programs
☆144Updated last month
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆87Updated 3 years ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆92Updated last year
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆210Updated 2 years ago
yixiaoer / mistral-v0.2-jax
JAX implementation of the Mistral 7b v0.2 model
☆35Updated last year
Sea-Snell / JAX_llama
Inference code for LLaMA models in JAX
☆120Updated last year
yixiaoer / tpux
A set of Python scripts that makes your experience on TPU better
☆54Updated 2 months ago
young-geng / mintext
Minimal but scalable implementation of large language models in JAX
☆35Updated last week
sustcsonglin / mamba-triton
☆50Updated last year
davisyoshida / qax
If it quacks like a tensor...
☆59Updated last year
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216Updated last year
ayaka14732 / jax-smi
JAX Synergistic Memory Inspector
☆183Updated last year
nshepperd / flash_attn_jax
JAX bindings for Flash Attention v2
☆99Updated last month
berlino / seq_icl
☆53Updated last year
radarFudan / mamba-minimal-jax
☆35Updated last year
proger / hippogriff
Griffin MQA + Hawk Linear RNN Hybrid
☆89Updated last year
cat-state / tinypar
☆20Updated 2 years ago
sholtodouglas / scalingExperiments
☆62Updated 3 years ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132Updated last year
erfanzar / jax-flash-attn2
A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/…
☆30Updated 9 months ago
lucaslingle / mu_transformer
Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.
☆32Updated 6 months ago
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆37Updated 2 years ago
dvruette / barrel-rec-pytorch
☆53Updated last year
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆174Updated 5 months ago
thecharlieblake / lovely-llama
An implementation of the Llama architecture, to instruct and delight
☆21Updated 6 months ago
srush / mamba-primer
☆38Updated last year
Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆79Updated last year
lucidrains / flash-attention-jax
Implementation of Flash Attention in Jax
☆223Updated last year
srush / do-we-need-attention
☆166Updated 2 years ago