bclarkson-code / Tricycle
Autograd to GPT-2 completely from scratch
☆107 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for Tricycle
- A tiny version of GPT fully implemented in Python with zero dependencies ☆60 · Updated 2 months ago
- a curated list of data for reasoning ai ☆113 · Updated 3 months ago
- A pure NumPy implementation of Mamba. ☆216 · Updated 4 months ago
- Mistral7B playing DOOM ☆122 · Updated 4 months ago
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … ☆193 · Updated this week
- A really tiny autograd engine ☆87 · Updated 7 months ago
- A BERT that you can train on a (gaming) laptop. ☆211 · Updated last year
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ☆250 · Updated last year
- Cerule - A Tiny Mighty Vision Model ☆67 · Updated 2 months ago
- Visualize the intermediate output of Mistral 7B ☆316 · Updated 9 months ago
- Video+code lecture on building nanoGPT from scratch ☆64 · Updated 5 months ago
- throwaway GPT inference ☆139 · Updated 5 months ago
- ☆234 · Updated 8 months ago
- a small code base for training large models ☆268 · Updated 3 weeks ago
- Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆167 · Updated 3 months ago
- look how they massacred my boy ☆58 · Updated last month
- Simple Transformer in Jax ☆119 · Updated 5 months ago
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆461 · Updated this week
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆104 · Updated 2 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆109 · Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆56 · Updated this week
- An implementation of bucketMul LLM inference ☆214 · Updated 4 months ago
- PyTorch implementation of models from the Zamba2 series. ☆158 · Updated this week
- run paligemma in real time ☆123 · Updated 6 months ago
- Fast approximate inference on a single GPU with sparsity aware offloading ☆38 · Updated 10 months ago
- ☆49 · Updated 8 months ago
- Generate ideal question-answers for testing RAG ☆123 · Updated 4 months ago
- Documented and Unit Tested educational Deep Learning framework with Autograd from scratch. ☆105 · Updated 7 months ago
- A repo to evaluate various LLM's chess playing abilities. ☆68 · Updated 7 months ago