Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆18 Jul 24, 2025 Updated 7 months ago
Alternatives and similar repositories for llm-jax
Users who are interested in llm-jax are comparing it to the libraries listed below.
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 Dec 27, 2023 Updated 2 years ago
- a Jax/Flax inference code of StarCoder ☆12 Jun 12, 2023 Updated 2 years ago
- Implementation of PSGD optimizer in JAX ☆35 Dec 31, 2024 Updated last year
- An AI character interaction system with emotional modeling and advanced memory management ☆17 Oct 26, 2024 Updated last year
- ☆16 Oct 20, 2025 Updated 4 months ago
- LLM training in simple, raw C/CUDA ☆15 Dec 5, 2024 Updated last year
- supporting pytorch FSDP for optimizers ☆84 Dec 8, 2024 Updated last year
- ☆23 Jan 5, 2025 Updated last year
- ☆20 Nov 18, 2024 Updated last year
- Efficient optimizers ☆285 Dec 20, 2025 Updated 2 months ago
- ☆23 Jun 18, 2024 Updated last year
- ☆22 Dec 15, 2023 Updated 2 years ago
- ☆24 Dec 16, 2024 Updated last year
- A set of Python scripts that makes your experience on TPU better ☆56 Sep 18, 2025 Updated 5 months ago
- ☆12 Jan 17, 2026 Updated last month
- quick playground to animate pippin ☆15 Nov 11, 2024 Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT ☆63 Feb 15, 2023 Updated 3 years ago
- A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/… ☆34 Mar 4, 2025 Updated last year
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆190 Jan 11, 2026 Updated last month
- Official implementation of "BERTs are Generative In-Context Learners" ☆32 Mar 14, 2025 Updated 11 months ago
- Jax like function transformation engine but micro, microjax ☆34 Oct 25, 2024 Updated last year
- ☆33 Nov 4, 2024 Updated last year
- A simple library for scaling up JAX programs ☆146 Nov 4, 2025 Updated 4 months ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Mar 14, 2022 Updated 3 years ago
- Tiny AutoEncoder for Stable Diffusion Videos ☆36 Oct 5, 2024 Updated last year
- Training code for Sparse Autoencoders on Embedding models ☆39 Feb 27, 2025 Updated last year
- Tool for generating pictures using mathematical formulas. ☆43 Nov 8, 2021 Updated 4 years ago
- Master control of robot using esp32 chip with openmv and tensorflow-lite support. ☆11 Mar 6, 2023 Updated 3 years ago
- This project compares the performance of Swin-Transformer v2 implemented in JAX and PyTorch. ☆12 Jun 8, 2022 Updated 3 years ago
- Discord Docsbot, Built on bgent ☆11 Jun 17, 2024 Updated last year
- PyTorch Implementation of Context-Aware Sequential Model for Multi-Behaviour Recommendation https://arxiv.org/abs/2312.09684 ☆10 May 31, 2024 Updated last year
- Extract streaming data from text using prefix completion. ☆10 Oct 6, 2024 Updated last year
- Research sources on graph-based anomaly detection ☆13 Nov 29, 2022 Updated 3 years ago
- an autonomous independent digital companion ☆14 Feb 12, 2026 Updated 3 weeks ago
- ☆12 Jul 8, 2024 Updated last year
- ☆47 Feb 26, 2026 Updated last week
- Cookiecutter template for making a cog for Red. ☆12 Jun 18, 2024 Updated last year
- The Ultimate OpenCode Starter Kit. Includes Oh My OpenCode config, Superpowers installation fix, MCP Setup, and Windows Crash Fix (exit_c… ☆18 Feb 10, 2026 Updated 3 weeks ago
- LevelDB database interface for the VeighNa framework ☆13 Apr 23, 2023 Updated 2 years ago