rwitten / HighPerfLLMs2024
☆267 · Updated 6 months ago
Alternatives and similar repositories for HighPerfLLMs2024:
Users who are interested in HighPerfLLMs2024 are comparing it to the repositories listed below.
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆534 · Updated this week
- seqax = sequence modeling + JAX ☆136 · Updated 6 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆363 · Updated this week
- ☆201 · Updated 6 months ago
- JAX implementation of the Llama 2 model ☆213 · Updated 11 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆505 · Updated 2 months ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta… ☆470 · Updated this week
- JAX-Toolbox ☆277 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆266 · Updated this week
- Building blocks for foundation models. ☆435 · Updated last year
- ☆126 · Updated this week
- ☆181 · Updated 3 weeks ago
- ☆170 · Updated this week
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆114 · Updated last year
- Cataloging released Triton kernels. ☆155 · Updated last week
- Annotated version of the Mamba paper ☆469 · Updated 10 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆215 · Updated this week
- A puzzle to learn about prompting ☆123 · Updated last year
- ☆275 · Updated this week
- Puzzles for exploring transformers ☆331 · Updated last year
- This repository contains the experimental PyTorch native float8 training UX ☆219 · Updated 5 months ago
- Solve puzzles. Learn CUDA. ☆61 · Updated last year
- Inference code for LLaMA models in JAX ☆114 · Updated 7 months ago
- A simple library for scaling up JAX programs ☆129 · Updated 2 months ago
- ring-attention experiments ☆116 · Updated 3 months ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆644 · Updated this week
- Cost aware hyperparameter tuning algorithm ☆137 · Updated 6 months ago
- ☆138 · Updated 11 months ago
- Applied AI experiments and examples for PyTorch ☆211 · Updated this week
- What would you do with 1000 H100s... ☆948 · Updated last year