sholtodouglas / minformer

Minimal transformer for arbitrary data (i.e. bio stuff!)

☆21

Alternatives and similar repositories for minformer:

Users who are interested in minformer are comparing it to the libraries listed below.

MatX-inc / seqax
seqax = sequence modeling + JAX
☆143 Updated 7 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆136 Updated 8 months ago
young-geng / mintext
Minimal but scalable implementation of large language models in JAX
☆32 Updated 3 months ago
google-deepmind / nanodo
☆211 Updated 7 months ago
young-geng / scalax
A simple library for scaling up JAX programs
☆129 Updated 3 months ago
dshah3 / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆62 Updated last year
yixiaoer / tpux
A set of Python scripts that makes your experience on TPU better
☆48 Updated 7 months ago
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆203 Updated last year
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆265 Updated 8 months ago
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆103 Updated 2 months ago
ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆76 Updated 2 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆95 Updated 3 months ago
evanatyourservice / psgd_jax
Implementation of PSGD optimizer in JAX
☆28 Updated last month
cloneofsimo / scaling-guide
WIP
☆93 Updated 6 months ago
clement-bonnet / lpn
Latent Program Network (from the "Searching Latent Program Spaces" paper)
☆59 Updated 2 months ago
nshepperd / flash_attn_jax
JAX bindings for Flash Attention v2
☆85 Updated 7 months ago
rwitten / HighPerfLLMs2024
☆377 Updated 7 months ago
imbue-ai / carbs
Cost aware hyperparameter tuning algorithm
☆143 Updated 7 months ago
siboehm / ShallowSpeed
Small scale distributed training of sequential deep learning models, built on Numpy and MPI.
☆118 Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆135 Updated 11 months ago
srush / GPTWorld
A puzzle to learn about prompting
☆124 Updated last year
sholtodouglas / scalingExperiments
☆58 Updated 2 years ago
jenkspt / gpt-jax
Jax/Flax rewrite of Karpathy's nanoGPT
☆56 Updated 2 years ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆122 Updated 10 months ago
Reytuag / transformerXL_PPO_JAX
☆73 Updated 3 months ago
balrog-ai / BALROG
Benchmarking Agentic LLM and VLM Reasoning On Games
☆115 Updated this week
Jaykef / Triton-nanoGPT
Custom triton kernels for training Karpathy's nanoGPT.
☆16 Updated 4 months ago
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆215 Updated last year
facebookresearch / minimax
Efficient baselines for autocurricula in JAX.
☆179 Updated 5 months ago