evanatyourservice / llm-jax
Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆18Jul 24, 2025Updated 6 months ago
Alternatives and similar repositories for llm-jax
Users that are interested in llm-jax are comparing it to the libraries listed below
- JAX/Flax inference code for StarCoder☆12Jun 12, 2023Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax.☆16Dec 27, 2023Updated 2 years ago
- Implementation of PSGD optimizer in JAX☆35Dec 31, 2024Updated last year
- An AI character interaction system with emotional modeling and advanced memory management☆17Oct 26, 2024Updated last year
- Minimal but scalable implementation of large language models in JAX☆35Nov 28, 2025Updated 2 months ago
- ☆16Oct 20, 2025Updated 3 months ago
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- Efficient optimizers☆283Dec 20, 2025Updated last month
- ☆23Jan 5, 2025Updated last year
- ☆20Nov 18, 2024Updated last year
- ☆22Dec 15, 2023Updated 2 years ago
- ☆24Dec 16, 2024Updated last year
- A set of Python scripts that makes your experience on TPU better☆56Sep 18, 2025Updated 4 months ago
- quick playground to animate pippin☆14Nov 11, 2024Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT☆63Feb 15, 2023Updated 3 years ago
- A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/…☆34Mar 4, 2025Updated 11 months ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…☆190Jan 11, 2026Updated last month
- ☆33Nov 4, 2024Updated last year
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- A simple library for scaling up JAX programs☆145Nov 4, 2025Updated 3 months ago
- some common Huggingface transformers in maximal update parametrization (µP)☆87Mar 14, 2022Updated 3 years ago
- Run TFLITE models on the web☆12Jan 2, 2022Updated 4 years ago
- Evaluating language models on word puzzle games☆10Oct 25, 2024Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆187Jan 19, 2026Updated 3 weeks ago
- Training code for Sparse Autoencoders on Embedding models☆39Feb 27, 2025Updated 11 months ago
- Artificial stock market (ASM) with Julia language.☆10Aug 10, 2021Updated 4 years ago
- Research sources on graph-based anomaly detection☆13Nov 29, 2022Updated 3 years ago
- ApertureDB Python Client☆12Jan 14, 2026Updated last month
- Discord Docsbot, Built on bgent☆11Jun 17, 2024Updated last year
- an autonomous independent digital companion☆14Feb 10, 2026Updated last week
- ☆12Jul 8, 2024Updated last year
- Extract streaming data from text using prefix completion.☆10Oct 6, 2024Updated last year
- PyTorch Implementation of Context-Aware Sequential Model for Multi-Behaviour Recommendation https://arxiv.org/abs/2312.09684☆10May 31, 2024Updated last year
- Text preprocessing package for use in NLP tasks https://pypi.org/project/textcl/☆11Aug 9, 2024Updated last year
- Cookiecutter template for making a cog for Red.☆12Jun 18, 2024Updated last year
- Master control of robot using esp32 chip with openmv and tensorflow-lite support.☆11Mar 6, 2023Updated 2 years ago
- Machine Learning eXperiment Utilities☆48Jul 29, 2025Updated 6 months ago
- Make triton easier☆50Jun 12, 2024Updated last year
- Rust mind maps compiled during personal study☆10Feb 2, 2024Updated 2 years ago