kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆293Updated last year
Alternatives and similar repositories for jax-diffusion-transformer
Users who are interested in jax-diffusion-transformer are comparing it to the libraries listed below
- ☆283Updated last year
- For optimization algorithm research and development.☆542Updated last week
- Minimal yet performant LLM examples in pure JAX☆193Updated last month
- Annotated version of the Mamba paper☆490Updated last year
- Efficient optimizers☆276Updated 3 weeks ago
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI☆291Updated 5 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds☆321Updated 3 months ago
- UNet diffusion model in pure CUDA☆651Updated last year
- 🧱 Modula software package☆300Updated 2 months ago
- ☆120Updated 4 months ago
- A simple library for scaling up JAX programs☆144Updated this week
- Dion optimizer algorithm☆379Updated last week
- The Tensor (or Array)☆452Updated last year
- A JAX-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more.☆296Updated last year
- ☆221Updated 11 months ago
- Universal Notation for Tensor Operations in Python.☆447Updated 7 months ago
- supporting pytorch FSDP for optimizers☆83Updated 11 months ago
- seqax = sequence modeling + JAX☆168Updated 3 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch☆96Updated 3 months ago
- ☆310Updated last year
- ☆176Updated last year
- ☆89Updated last year
- A simple implementation of Bayesian Flow Networks (BFN)☆240Updated last year
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.☆301Updated last week
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX☆680Updated this week
- Flow-matching algorithms in JAX☆106Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆171Updated 4 months ago
- Simple and readable code for training and sampling from diffusion models☆647Updated 4 months ago
- Normalized Transformer (nGPT)☆192Updated 11 months ago
- WIP☆93Updated last year