☆23 Jun 18, 2024 Updated last year
Alternatives and similar repositories for efae
Users that are interested in efae are comparing it to the libraries listed below
- ☆13 Jun 3, 2024 Updated last year
- ☆23 Oct 15, 2024 Updated last year
- WIP ☆94 Aug 13, 2024 Updated last year
- ☆27 May 3, 2024 Updated last year
- ☆34 Sep 10, 2024 Updated last year
- ☆33 Nov 4, 2024 Updated last year
- An implementation of the Llama architecture, to instruct and delight ☆21 May 31, 2025 Updated 8 months ago
- ☆47 Jan 18, 2024 Updated 2 years ago
- JAX Scalify: end-to-end scaled arithmetics ☆18 Oct 30, 2024 Updated last year
- ☆48 Feb 23, 2025 Updated last year
- Focused on fast experimentation and simplicity ☆80 Dec 24, 2024 Updated last year
- ☆12 Jan 4, 2024 Updated 2 years ago
- recipe for training fully-featured self supervised image jepa models ☆12 Jun 4, 2025 Updated 8 months ago
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 Dec 27, 2023 Updated 2 years ago
- research impl of Native Sparse Attention (2502.11089) ☆63 Feb 19, 2025 Updated last year
- nanoGPT using Equinox ☆15 Mar 3, 2023 Updated 2 years ago
- [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers ☆15 Feb 12, 2026 Updated 2 weeks ago
- ☆34 May 14, 2025 Updated 9 months ago
- PyTorch implementation of StableMask (ICML'24) ☆15 Jun 27, 2024 Updated last year
- ☆16 Oct 20, 2025 Updated 4 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 Jul 24, 2025 Updated 7 months ago
- ☆18 Aug 24, 2024 Updated last year
- ☆20 May 30, 2024 Updated last year
- Fast and memory-efficient exact attention ☆29 Dec 2, 2024 Updated last year
- ☆19 Dec 4, 2025 Updated 2 months ago
- An efficient implementation of the NSA (Native Sparse Attention) kernel ☆129 Jun 24, 2025 Updated 8 months ago
- Code for the paper "Function-Space Learning Rates" ☆25 Jun 3, 2025 Updated 8 months ago
- ☆20 Nov 23, 2022 Updated 3 years ago
- A JAX nn library ☆22 Sep 9, 2025 Updated 5 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 Apr 17, 2024 Updated last year
- GPT2 Byte Pair Encoding implementation in Golang ☆24 Jul 9, 2025 Updated 7 months ago
- A gradio/nicegui style wrapper for flet. ☆19 Nov 17, 2023 Updated 2 years ago
- ☆28 Oct 7, 2025 Updated 4 months ago
- Utilities for PyTorch distributed ☆25 Feb 27, 2025 Updated last year
- A port of muP to JAX/Haiku ☆25 Oct 23, 2022 Updated 3 years ago
- ☆93 Jul 5, 2024 Updated last year
- ☆24 Sep 25, 2024 Updated last year
- ☆21 Mar 3, 2025 Updated 11 months ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆56 Mar 10, 2025 Updated 11 months ago