AI-Hypercomputer / maxtext

A simple, performant and scalable Jax LLM!

☆1,848

Alternatives and similar repositories for maxtext

Users who are interested in maxtext are comparing it to the libraries listed below.

AnswerDotAI / fsdp_qlora
Training LLMs with QLoRA + FSDP
☆1,524 Updated 8 months ago
huggingface / optimum-nvidia
☆988 Updated 5 months ago
google / paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…
☆517 Updated this week
Lightning-AI / lightning-thunder
PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily wri…
☆1,384 Updated this week
myshell-ai / JetMoE
Reaching LLaMA2 Performance with 0.1M Dollars
☆983 Updated last year
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆4,125 Updated this week
Vahe1994 / AQLM
Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p…
☆1,278 Updated 2 months ago
apple / axlearn
An Extensible Deep Learning Library
☆2,215 Updated this week
MDK8888 / GPTFast
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
☆685 Updated 11 months ago
google-deepmind / penzai
A JAX research toolkit for building, editing, and visualizing neural networks.
☆1,805 Updated last month
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆2,541 Updated this week
pytorch / ao
PyTorch native quantization and sparsity for training and inference
☆2,219 Updated this week
rwitten / HighPerfLLMs2024
☆516 Updated last year
google-deepmind / recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
☆645 Updated last month
jiaweizzhao / GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
☆1,579 Updated 9 months ago
AI-Hypercomputer / JetStream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…
☆364 Updated last month
pytorch-labs / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,036 Updated 3 months ago
google-ai-edge / model-explorer
A modern model graph visualizer and debugger
☆1,288 Updated last week
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,068 Updated 3 weeks ago
likejazz / llama3.np
llama3.np is a pure NumPy implementation for Llama 3 model.
☆987 Updated 3 months ago
facebookresearch / llm-transparency-tool
LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. …
☆826 Updated 7 months ago
srush / Triton-Puzzles
Puzzles for learning Triton
☆1,801 Updated 8 months ago
intel / intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl…
☆2,168 Updated 9 months ago
openxla / xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
☆3,383 Updated this week
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,193 Updated 2 months ago
Liuhong99 / Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
☆965 Updated last year
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆808 Updated 2 weeks ago
stanford-crfm / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆627 Updated this week
kyegomez / BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch
☆1,865 Updated last week
srush / LLM-Training-Puzzles
What would you do with 1000 H100s...
☆1,068 Updated last year