adamkarvonen / chess_llm_interpretability
Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and representation of player Elo.
☆199 · Updated 2 months ago
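The repository described above probes a chess GPT's internal activations for board state and intervenes on them. The core idea, training a linear probe on residual-stream activations to decode a board feature, can be sketched as follows. This is a minimal illustration with synthetic data; the shapes, names, and the use of plain least squares are assumptions for the sketch, not the repository's actual API.

```python
import numpy as np

# Hypothetical setup: decode a binary board feature (e.g. "is square e4
# occupied?") from a model's residual-stream activations with a linear probe.
# The activations here are synthetic; the feature is planted along one direction.
rng = np.random.default_rng(0)

d_model, n_samples = 64, 500
true_direction = rng.normal(size=d_model)           # planted "board state" direction
acts = rng.normal(size=(n_samples, d_model))        # fake residual-stream activations
labels = (acts @ true_direction > 0).astype(float)  # feature encoded linearly by construction

# Fit the probe by least squares against +/-1 targets (ridge omitted for brevity).
w, *_ = np.linalg.lstsq(acts, labels * 2 - 1, rcond=None)
preds = (acts @ w > 0).astype(float)
accuracy = (preds == labels).mean()
print(f"probe accuracy: {accuracy:.2f}")  # high accuracy -> feature is linearly decodable
```

If the probe reaches high accuracy, the feature is (approximately) linearly represented, which is also what makes intervention possible: adding or subtracting a multiple of `w` from the activations steers the model's represented board state.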
Alternatives and similar repositories for chess_llm_interpretability:
Users interested in chess_llm_interpretability are comparing it to the repositories listed below.
- A repo to evaluate various LLMs' chess-playing abilities. ☆75 · Updated 10 months ago
- A repository for training nanoGPT-based chess-playing language models. ☆23 · Updated 9 months ago
- Mistral7B playing DOOM ☆127 · Updated 7 months ago
- Grandmaster-Level Chess Without Search ☆550 · Updated last month
- Visualize the intermediate output of Mistral 7B ☆338 · Updated 3 weeks ago
- An implementation of bucketMul LLM inference ☆215 · Updated 7 months ago
- History files recording human interactions while solving ARC tasks ☆97 · Updated this week
- A small codebase for training large models ☆284 · Updated last month
- Simple Transformer in JAX ☆136 · Updated 7 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆164 · Updated this week
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆604 · Updated 2 months ago
- LLM verified with Monte Carlo Tree Search ☆264 · Updated last week
- Draw more samples ☆186 · Updated 7 months ago
- ☆143 · Updated last year
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 · Updated last year
- Teaching transformers to play chess ☆113 · Updated 3 weeks ago
- Full finetuning of large language models without large memory requirements ☆93 · Updated last year
- Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping". ☆355 · Updated 8 months ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆341 · Updated 6 months ago
- A pure NumPy implementation of Mamba. ☆219 · Updated 7 months ago
- A crude RLHF layer on top of nanoGPT with the Gumbel-Softmax trick ☆289 · Updated last year
- Run PaliGemma in real time ☆130 · Updated 8 months ago
- Our solution for the ARC challenge 2024 ☆98 · Updated 2 months ago
- ☆92 · Updated last year
- Benchmark LLM reasoning capability by solving chess puzzles. ☆72 · Updated 8 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full finetunes. ☆82 · Updated last year
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆342 · Updated 8 months ago
- Implement recursion using English as the programming language and an LLM as the runtime. ☆136 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆183 · Updated 8 months ago
- ☆207 · Updated 7 months ago