google / flaxformerLinks

☆364

Alternatives and similar repositories for flaxformer

Users that are interested in flaxformer are comparing it to the libraries listed below

Sorting:

google / praxis
☆190Updated 2 weeks ago
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆210Updated 2 years ago
lucidrains / flash-attention-jax
Implementation of Flash Attention in Jax
☆223Updated last year
google / seqio
Task-based datasets, preprocessing, and evaluation for sequence models.
☆590Updated 3 weeks ago
lucidrains / PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)
☆189Updated 3 years ago
ayaka14732 / jax-smi
JAX Synergistic Memory Inspector
☆183Updated last year
google-research / meliad
☆259Updated 6 months ago
huggingface / bloom-jax-inference
☆66Updated 3 years ago
Sea-Snell / JAX_llama
Inference code for LLaMA models in JAX
☆120Updated last year
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216Updated last year
facebookresearch / mega
Sequence modeling with Mega.
☆301Updated 2 years ago
google / paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…
☆540Updated 3 weeks ago
google-research / jestimator
Amos optimizer with JEstimator lib.
☆82Updated last year
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆277Updated 3 years ago
srush / do-we-need-attention
☆166Updated 2 years ago
HazyResearch / H3
Language Modeling with the H3 State Space Model
☆519Updated 2 years ago
kingoflolz / swarm-jax
Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
☆242Updated 2 years ago
jax-ml / jax-triton
jax-triton contains integrations between JAX and OpenAI Triton
☆436Updated this week
google-deepmind / jmp
JMP is a Mixed Precision library for JAX.
☆211Updated 10 months ago
marin-community / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆685Updated last week
facebookresearch / torchdim
Named tensors with first-class dimensions for PyTorch
☆332Updated 2 years ago
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆143Updated last year
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆207Updated 2 years ago
sholtodouglas / scalingExperiments
☆62Updated 3 years ago
lucidrains / simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT
☆224Updated last year
srush / annotated-s4
Implementation of https://srush.github.io/annotated-s4
☆506Updated 5 months ago
tech-srl / RASP
An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"
☆323Updated last year
google-deepmind / pg19
☆250Updated 5 years ago
lucidrains / CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
☆230Updated last year
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training!
☆113Updated 2 years ago