AI-Hypercomputer / maxdiffusion
☆274 · Updated this week
Alternatives and similar repositories for maxdiffusion
Users interested in maxdiffusion are comparing it to the libraries listed below.
- Google TPU optimizations for transformer models · ☆122 · Updated 9 months ago
- Pax is a Jax-based machine learning framework for training large-scale models. Pax allows for advanced and fully configurable experimenta… · ☆539 · Updated 2 months ago
- ☆145 · Updated last week
- A Jax quantization library · ☆57 · Updated this week
- JAX-Toolbox · ☆359 · Updated this week
- JAX implementation of the Llama 2 model · ☆215 · Updated last year
- ☆310 · Updated last year
- ☆91 · Updated last year
- Scalable and performant data loading · ☆335 · Updated this week
- Dion optimizer algorithm · ☆383 · Updated this week
- ☆190 · Updated 3 weeks ago
- Load compute kernels from the Hub · ☆326 · Updated this week
- JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… · ☆388 · Updated 5 months ago
- ☆68 · Updated last year
- ☆337 · Updated last week
- Modular, scalable library to train ML models · ☆170 · Updated this week
- Supporting PyTorch FSDP for optimizers · ☆83 · Updated 11 months ago
- Minimal yet performant LLM examples in pure JAX · ☆198 · Updated last month
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) · ☆446 · Updated last week
- Efficient optimizers · ☆276 · Updated 3 weeks ago
- jax-triton contains integrations between JAX and OpenAI Triton · ☆433 · Updated last month
- ☆285 · Updated last year
- JAX implementation of Black Forest Labs' Flux.1 family of models · ☆39 · Updated 2 months ago
- A FlashAttention implementation for JAX with support for efficient document mask computation and context parallelism · ☆149 · Updated 7 months ago
- A simple library for scaling up JAX programs · ☆144 · Updated last week
- Focused on fast experimentation and simplicity · ☆75 · Updated 10 months ago
- Implementation of Flash Attention in Jax · ☆220 · Updated last year
- Implementation of Diffusion Transformer (DiT) in JAX · ☆294 · Updated last year
- A library for unit scaling in PyTorch · ☆132 · Updated 4 months ago
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training · ☆132 · Updated last year