PraveenRaja42 / Tiny-Stories-GPT
A minimal PyTorch re-implementation of GPT (Generative Pretrained Transformer) language model training
☆18 Updated 2 years ago
Alternatives and similar repositories for Tiny-Stories-GPT
Users who are interested in Tiny-Stories-GPT are comparing it to the libraries listed below
- Simple repository for training small reasoning models ☆47 Updated 10 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think ☆71 Updated this week
- NanoGPT-speedrunning for the poor T4 enjoyers ☆73 Updated 7 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆63 Updated last week
- Simple GRPO scripts and configurations. ☆59 Updated 10 months ago
- ☆40 Updated last year
- Collection of autoregressive model implementations ☆85 Updated 7 months ago
- This repository contains a simple llama3 implementation in pure JAX. ☆70 Updated 10 months ago
- ML/DL Math and Method notes ☆65 Updated 2 years ago
- LLM training in simple, raw C/CUDA ☆15 Updated last year
- ☆28 Updated last year
- A JAX-like function transformation engine, but micro: microjax ☆34 Updated last year
- QLoRA for Masked Language Modeling ☆22 Updated 2 years ago
- ☆59 Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated 2 months ago
- Rust implementation of micrograd ☆53 Updated last year
- ☆53 Updated 10 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆103 Updated last year
- ☆55 Updated last year
- An introduction to LLM Sampling ☆79 Updated last year
- ☆82 Updated last year
- ☆62 Updated 2 years ago
- ☆86 Updated last year
- Andrej Karpathy's micrograd implemented in C ☆30 Updated last year
- Various handy scripts to quickly set up new Linux and Windows sandboxes, containers and WSL. ☆40 Updated this week
- A sample pattern for running CI tests on Modal ☆18 Updated 8 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆70 Updated 2 years ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated 3 months ago
- Simplex Random Feature attention, in PyTorch ☆75 Updated 2 years ago
- Minimal (400 LOC) implementation of maximal (multi-node, FSDP) GPT training ☆132 Updated last year