Simple Transformer in Jax
☆143Jun 22, 2024Updated last year
Alternatives and similar repositories for simple_transformer
Users who are interested in simple_transformer are comparing it to the libraries listed below.
- ☆40Jul 26, 2024Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆3,435Nov 13, 2024Updated last year
- Training code for Sparse Autoencoders on Embedding models☆39May 9, 2026Updated 2 weeks ago
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- smol models are fun too☆94Nov 9, 2024Updated last year
- Frechet inception distance (FID) evaluation in JAX☆14May 28, 2024Updated last year
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- A graph visualization of attention☆56May 20, 2025Updated last year
- ☆13Jun 18, 2024Updated last year
- ☆14Apr 16, 2025Updated last year
- An implementation of the Llama architecture, to instruct and delight☆21May 31, 2025Updated 11 months ago
- Sparsify transformers with SAEs and transcoders☆721Updated this week
- Training Models Daily☆16Dec 19, 2023Updated 2 years ago
- High Quality Resources on GPU Programming/Architecture☆592Jul 26, 2024Updated last year
- quickstart template for LLM projects. includes useful tools, logging, cost tracking etc, v easy☆13Feb 27, 2025Updated last year
- DeMo: Decoupled Momentum Optimization☆201Dec 2, 2024Updated last year
- Knowledge base Claude application☆43Jan 3, 2026Updated 4 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Apr 22, 2025Updated last year
- Automatically annotates YOLO dataset using Moondream visual model☆21Aug 24, 2025Updated 9 months ago
- Reasoning Computers. Lambda Calculus, Fully Differentiable. Also Neural Stacks, Queues, Arrays, Lists, Trees, and Latches.☆288Nov 3, 2024Updated last year
- A light tensor library in zig.☆77Feb 9, 2025Updated last year
- Build your own visual reasoning model☆422Jan 13, 2026Updated 4 months ago
- Token-level adaptation of LoRA matrices for downstream task generalization.☆15Apr 14, 2024Updated 2 years ago
- ☆27Jul 9, 2024Updated last year
- It's a baby compiler. (Lean btw.)☆16May 19, 2025Updated last year
- utilities for batched llm calls with retries☆50Apr 23, 2026Updated last month
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 9 months ago
- Our library for RL environments + evals☆4,125Updated this week
- Minimal implementation of scalable rectified flow transformers, based on SD3's approach☆634Jul 1, 2024Updated last year
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- See https://github.com/cuda-mode/triton-index/ instead!☆11May 8, 2024Updated 2 years ago
- ☆33Nov 4, 2024Updated last year
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- An automated tool for discovering insights from research paper corpora☆137Jun 8, 2024Updated last year
- ☆93Jul 5, 2024Updated last year
- ☆12Jun 2, 2023Updated 2 years ago
- NanoGPT (124M) in 90 seconds☆5,270May 14, 2026Updated last week
- Smart reproducible analytical pipeline inspection☆21Feb 13, 2026Updated 3 months ago
- ☆22Nov 9, 2024Updated last year