Simple Transformer in Jax
☆143Jun 22, 2024Updated last year
Alternatives and similar repositories for simple_transformer
Users who are interested in simple_transformer are comparing it to the libraries listed below.
- ☆40Jul 26, 2024Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆3,431Nov 13, 2024Updated last year
- Training code for Sparse Autoencoders on Embedding models☆39Apr 5, 2026Updated last week
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- smol models are fun too☆93Nov 9, 2024Updated last year
- Frechet inception distance (FID) evaluation in JAX☆14May 28, 2024Updated last year
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- A graph visualization of attention☆56May 20, 2025Updated 10 months ago
- ☆13Jun 18, 2024Updated last year
- ☆14Apr 16, 2025Updated last year
- An implementation of the Llama architecture, to instruct and delight☆21May 31, 2025Updated 10 months ago
- Sparsify transformers with SAEs and transcoders☆705Apr 6, 2026Updated last week
- ☆308Jul 15, 2024Updated last year
- Training Models Daily☆16Dec 19, 2023Updated 2 years ago
- gzip Predicts Data-dependent Scaling Laws☆35May 28, 2024Updated last year
- High Quality Resources on GPU Programming/Architecture☆592Jul 26, 2024Updated last year
- DeMo: Decoupled Momentum Optimization☆198Dec 2, 2024Updated last year
- An introduction to LLM Sampling☆80Dec 15, 2024Updated last year
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Apr 22, 2025Updated 11 months ago
- Automatically annotates YOLO dataset using Moondream visual model☆20Aug 24, 2025Updated 7 months ago
- Reasoning Computers. Lambda Calculus, Fully Differentiable. Also Neural Stacks, Queues, Arrays, Lists, Trees, and Latches.☆285Nov 3, 2024Updated last year
- A light tensor library in zig.☆77Feb 9, 2025Updated last year
- Build your own visual reasoning model☆421Jan 13, 2026Updated 3 months ago
- No frills LLM-assisted programming☆251Jul 24, 2024Updated last year
- utilities for batched llm calls with retries☆49Apr 8, 2026Updated last week
- Token-level adaptation of LoRA matrices for downstream task generalization.☆15Apr 14, 2024Updated 2 years ago
- ☆27Jul 9, 2024Updated last year
- It's a baby compiler. (Lean btw.)☆16May 19, 2025Updated 10 months ago
- Our library for RL environments + evals☆3,986Updated this week
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 8 months ago
- Minimal implementation of scalable rectified flow transformers, based on SD3's approach☆635Jul 1, 2024Updated last year
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- ☆33Nov 4, 2024Updated last year
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- An automated tool for discovering insights from research paper corpora☆137Jun 8, 2024Updated last year
- ☆92Jul 5, 2024Updated last year
- ☆12Jun 2, 2023Updated 2 years ago
- NanoGPT (124M) in 2 minutes☆5,070Mar 29, 2026Updated 2 weeks ago
- Smart reproducible analytical pipeline inspection☆21Feb 13, 2026Updated 2 months ago