tanishqkumar / beyond-nanogpt
Minimal and annotated implementations of key ideas from modern deep learning research.
☆270 Updated this week
Alternatives and similar repositories for beyond-nanogpt:
Users who are interested in beyond-nanogpt are comparing it to the repositories listed below:
- Textbook on reinforcement learning from human feedback ☆795 Updated this week
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆385 Updated 2 weeks ago
- Build your own visual reasoning model ☆341 Updated this week
- Exploring Applications of GRPO ☆185 Updated last week
- The Multilayer Perceptron Language Model ☆543 Updated 8 months ago
- Following master Karpathy with a GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish ☆174 Updated 8 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆337 Updated last month
- An extension of the nanoGPT repository for training small MOE models. ☆131 Updated last month
- The Autograd Engine ☆597 Updated 7 months ago
- ☆218 Updated this week
- System 2 Reasoning Link Collection ☆826 Updated last month
- The Tensor (or Array) ☆429 Updated 8 months ago
- ☆144 Updated last month
- Simple and readable code for training and sampling from diffusion models ☆478 Updated 3 months ago
- An example starter repo for Python projects ☆279 Updated last month
- procedural reasoning datasets ☆571 Updated this week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆240 Updated last week
- Educational implementation of a small GPT model from scratch in a single Jupyter Notebook ☆91 Updated 2 months ago
- PyTorch building blocks for the OLMo ecosystem ☆197 Updated this week
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆163 Updated 2 weeks ago
- Getting crystal-like representations with harmonic loss ☆182 Updated 3 weeks ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆786 Updated last month
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆607 Updated last month
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆991 Updated last month
- An ML Systems Onboarding list ☆756 Updated 3 months ago
- CodeScientist: An automated scientific discovery system for code-based experiments ☆221 Updated 3 weeks ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆304 Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆304 Updated 6 months ago
- Verifiers for LLM Reinforcement Learning ☆827 Updated 3 weeks ago
- Recipes to scale inference-time compute of open models ☆1,058 Updated 2 months ago