Simple Transformer in Jax
☆143 · Jun 22, 2024 · Updated last year
Alternatives and similar repositories for simple_transformer
Users who are interested in simple_transformer are comparing it to the libraries listed below.
- ☆40 · Jul 26, 2024 · Updated last year
- Entropy Based Sampling and Parallel CoT Decoding · ☆3,434 · Nov 13, 2024 · Updated last year
- Jax like function transformation engine but micro, microjax · ☆34 · Oct 25, 2024 · Updated last year
- Training Models Daily · ☆16 · Dec 19, 2023 · Updated 2 years ago
- Training code for Sparse Autoencoders on Embedding models · ☆39 · Feb 27, 2025 · Updated last year
- smolLM with Entropix sampler on pytorch · ☆149 · Oct 31, 2024 · Updated last year
- An implementation of the Llama architecture, to instruct and delight · ☆21 · May 31, 2025 · Updated 9 months ago
- A graph visualization of attention · ☆57 · May 20, 2025 · Updated 9 months ago
- gzip Predicts Data-dependent Scaling Laws · ☆34 · May 28, 2024 · Updated last year
- Frechet inception distance (FID) evaluation in JAX · ☆14 · May 28, 2024 · Updated last year
- ☆93 · Jul 5, 2024 · Updated last year
- utilities for batched llm calls with retries · ☆46 · Feb 26, 2026 · Updated last week
- Build your own visual reasoning model · ☆419 · Jan 13, 2026 · Updated last month
- ☆292 · Jul 15, 2024 · Updated last year
- smol models are fun too · ☆93 · Nov 9, 2024 · Updated last year
- DeMo: Decoupled Momentum Optimization