bclarkson-code / Tricycle
Autograd to GPT-2 completely from scratch
☆110 Updated 2 weeks ago
Alternatives and similar repositories for Tricycle:
Users who are interested in Tricycle are comparing it to the libraries listed below:
- Mistral7B playing DOOM ☆127 Updated 7 months ago
- A really tiny autograd engine ☆89 Updated 10 months ago
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ☆250 Updated last year
- a curated list of data for reasoning ai ☆128 Updated 6 months ago
- Documented and Unit Tested educational Deep Learning framework with Autograd from scratch. ☆110 Updated 10 months ago
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … ☆200 Updated 3 months ago
- run paligemma in real time ☆130 Updated 9 months ago
- A pure NumPy implementation of Mamba. ☆219 Updated 7 months ago
- Visualize the intermediate output of Mistral 7B ☆339 Updated 3 weeks ago
- look how they massacred my boy ☆63 Updated 4 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆127 Updated last week
- PyTorch implementation of models from the Zamba2 series. ☆176 Updated 3 weeks ago
- ☆111 Updated 2 weeks ago
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers ☆104 Updated 5 months ago
- Simple Transformer in Jax ☆136 Updated 7 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆116 Updated 2 months ago
- ☆239 Updated 11 months ago
- High-Performance FP32 Matrix Multiplication on CPU ☆333 Updated this week
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- Diffusion on syntax trees for program synthesis ☆442 Updated 7 months ago
- LLM verified with Monte Carlo Tree Search ☆264 Updated last week
- Inference of Mamba models in pure C ☆183 Updated 11 months ago
- The history files when recording human interaction while solving ARC tasks ☆97 Updated this week
- Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping". ☆358 Updated 8 months ago
- throwaway GPT inference ☆140 Updated 8 months ago
- A tiny version of GPT fully implemented in Python with zero dependencies ☆63 Updated 2 months ago
- ☆160 Updated 8 months ago
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 5 months ago
- Alex Krizhevsky's original code from Google Code ☆189 Updated 8 years ago