HanClinto / MENTAT
☆9 Updated 8 months ago
Alternatives and similar repositories for MENTAT
Users that are interested in MENTAT are comparing it to the libraries listed below
- A repo to evaluate various LLMs' chess-playing abilities. ☆81 Updated last year
- A curated list of data for reasoning AI ☆136 Updated 11 months ago
- Visualize the intermediate output of Mistral 7B ☆366 Updated 5 months ago
- Teaching transformers to play chess ☆131 Updated 5 months ago
- ☆116 Updated 5 months ago
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l… ☆286 Updated 2 weeks ago
- gzip Predicts Data-dependent Scaling Laws ☆35 Updated last year
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … ☆206 Updated 8 months ago
- ☆215 Updated 4 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆240 Updated 5 months ago
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆126 Updated 2 months ago
- A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co… ☆282 Updated this week
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 Updated 11 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆54 Updated 4 months ago
- Visualize text embeddings ☆40 Updated 2 years ago
- An attribution library for LLMs ☆42 Updated 10 months ago
- ☆87 Updated 6 months ago
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆349 Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆80 Updated last year
- Controlled Text Generation via Language Model Arithmetic ☆222 Updated 10 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆131 Updated 3 weeks ago
- Dead Simple LLM Abliteration ☆224 Updated 5 months ago
- Functional Benchmarks and the Reasoning Gap ☆88 Updated 9 months ago
- Benchmark LLM reasoning capability by solving chess puzzles. ☆83 Updated 2 months ago
- ☆61 Updated last year
- ☆101 Updated 5 months ago
- Code for the Fractured Entangled Representation Hypothesis position paper! ☆135 Updated 2 months ago
- ☆134 Updated 3 months ago
- ☆122 Updated 11 months ago
- Mistral7B playing DOOM ☆132 Updated last year