francoisfleuret / picogpt

Minimal GPT (~350 lines with a simple task to test it)

☆62

Alternatives and similar repositories for picogpt

Users who are interested in picogpt are comparing it to the libraries listed below.

BlackHC / neural_net_checklist
☆150 Updated 11 months ago
Artur-Galstyan / statedict2pytree
☆43 Updated 2 months ago
Sohl-Dickstein / fractal
The boundary of neural network trainability is fractal
☆215 Updated last year
probabilists / azula
Diffusion models in PyTorch
☆107 Updated last month
evanatyourservice / kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
☆94 Updated 2 weeks ago
vtabbott / Algebraic-NCD
A package for defining deep learning models using categorical algebraic expressions.
☆61 Updated last year
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆101 Updated 7 months ago
KindXiaoming / grow-crystals
Getting crystal-like representations with harmonic loss
☆192 Updated 4 months ago
joey00072 / microjax
Jax like function transformation engine but micro, microjax
☆33 Updated 9 months ago
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆67 Updated 11 months ago
sytelus / pcprep
Various handy scripts to quickly setup new Linux and Windows sandboxes, containers and WSL.
☆40 Updated 3 months ago
cgarciae / einop
☆60 Updated 3 years ago
main-horse / hnet
H-Net Dynamic Hierarchical Architecture
☆65 Updated 2 weeks ago
apoorvkh / academic-pretraining
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
☆143 Updated 2 months ago
srush / Tensor-Puzzles-Penzai
☆21 Updated last year
cloneofsimo / zeroshampoo
☆34 Updated 10 months ago
yobibyte / report
Because we don't want a jupyter notebook mess...
☆61 Updated last month
riverstone496 / awesome-second-order-optimization
☆27 Updated last year
kvfrans / jax-flow
Flow-matching algorithms in JAX
☆100 Updated 11 months ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆130 Updated last year
cloneofsimo / scaling-guide
WIP
☆94 Updated 11 months ago
jxmorris12 / gptzip
Losslessly encode text natively with arithmetic coding and HuggingFace Transformers
☆76 Updated last year
okarthikb / state-space-models
☆27 Updated last year
fal-ai / diffusion-speedrun
Focused on fast experimentation and simplicity
☆76 Updated 7 months ago
lucidrains / spline-based-transformer
Implementation of the proposed Spline-Based Transformer from Disney Research
☆102 Updated 8 months ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 6 months ago
dvruette / barrel-rec-pytorch
☆53 Updated last year
CLAIRE-Labo / flash_attention
A basic pure pytorch implementation of flash attention
☆16 Updated 9 months ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆68 Updated 3 months ago
uzulim / hades
Fast singularity detection with kernel
☆35 Updated last year