PraveenRaja42 / Tiny-Stories-GPT
A minimal PyTorch re-implementation of GPT (Generative Pretrained Transformer) language model training
☆17Updated 2 years ago
Alternatives and similar repositories for Tiny-Stories-GPT
Users who are interested in Tiny-Stories-GPT are comparing it to the libraries listed below
- This repository contains a simple Llama 3 implementation in pure JAX.☆70Updated 9 months ago
- ☆28Updated last year
- Various handy scripts to quickly setup new Linux and Windows sandboxes, containers and WSL.☆40Updated 2 weeks ago
- microjax: a micro, JAX-like function transformation engine☆33Updated last year
- ☆40Updated last year
- Simple repository for training small reasoning models☆46Updated 9 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs☆62Updated last week
- gzip Predicts Data-dependent Scaling Laws☆34Updated last year
- Andrej Karpathy's micrograd implemented in C☆30Updated last year
- Functional local implementations of main model parallelism approaches☆96Updated 2 years ago
- LLM training in simple, raw C/CUDA☆15Updated 11 months ago
- A really tiny autograd engine☆96Updated 6 months ago
- Simple GRPO scripts and configurations.☆59Updated 9 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Updated 7 months ago
- ☆54Updated last year
- Simplex Random Feature attention, in PyTorch☆75Updated 2 years ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single-machine microbatches, in PyTorch☆25Updated 10 months ago
- Rust Implementation of micrograd☆53Updated last year
- ☆94Updated 2 years ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors`☆47Updated last year
- ☆53Updated last year
- ☆144Updated 2 years ago
- ML/DL Math and Method notes☆64Updated last year
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator☆32Updated 2 years ago
- ☆10Updated last year
- A sample pattern for running CI tests on Modal☆18Updated 7 months ago
- Simple Transformer in Jax☆139Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training☆51Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆103Updated 11 months ago
- ☆22Updated 2 years ago