HanClinto / MENTAT

☆9

Alternatives and similar repositories for MENTAT

Users who are interested in MENTAT are comparing it to the repositories listed below

adamkarvonen / chess_gpt_eval
A repo to evaluate the chess-playing abilities of various LLMs.
☆81 Updated last year
neurallambda / awesome-reasoning
a curated list of data for reasoning ai
☆136 Updated 11 months ago
valine / NeuralFlow
Visualize the intermediate output of Mistral 7B
☆366 Updated 5 months ago
sgrvinod / chess-transformers
Teaching transformers to play chess
☆131 Updated 5 months ago
llmonpy / needle-in-a-needlestack
☆116 Updated 5 months ago
rentruewang / bocoel
Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l…
☆286 Updated 2 weeks ago
KhoomeiK / complexity-scaling
gzip Predicts Data-dependent Scaling Laws
☆35 Updated last year
adamkarvonen / chess_llm_interpretability
Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …
☆206 Updated 8 months ago
em-llm / EM-LLM-model
☆215 Updated 4 months ago
Mihaiii / llm_steer
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…
☆240 Updated 5 months ago
valine / training-hot-swap
Pytorch script hot swap: Change code without unloading your LLM from VRAM
☆126 Updated 2 months ago
lechmazur / elimination_game
A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co…
☆282 Updated this week
haizelabs / thorn-in-haizestack
Thorn in a HaizeStack test for evaluating long-context adversarial robustness.
☆26 Updated 11 months ago
egozverev / Should-It-Be-Executed-Or-Processed
Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.
☆54 Updated 4 months ago
colehaus / hammock-public
Visualize text embeddings
☆40 Updated 2 years ago
leap-laboratories / PIZZA
An attribution library for LLMs
☆42 Updated 10 months ago
joshuacnf / Ctrl-G
☆87 Updated 6 months ago
EGjoni / DRUGS
Stop messing around with finicky sampling parameters and just use DRµGS!
☆349 Updated last year
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆80 Updated last year
eth-sri / language-model-arithmetic
Controlled Text Generation via Language Model Arithmetic
☆222 Updated 10 months ago
LAION-AI / AIW
Alice in Wonderland code base for experiments and raw experiments data
☆131 Updated 3 weeks ago
Tsadoq / ErisForge
Dead Simple LLM Abliteration
☆224 Updated 5 months ago
ConsequentAI / fneval
Functional Benchmarks and the Reasoning Gap
☆88 Updated 9 months ago
kagisearch / llm-chess-puzzles
Benchmark LLM reasoning capability by solving chess puzzles.
☆83 Updated 2 months ago
NousResearch / StripedHyenaTrainer
☆61 Updated last year
KihoPark / LLM_Categorical_Hierarchical_Representations
☆101 Updated 5 months ago
akarshkumar0101 / fer
Code for the Fractured Entangled Representation Hypothesis position paper!
☆135 Updated 2 months ago
google-deepmind / mishax
☆134 Updated 3 months ago
normal-computing / extended-mind-transformers
☆122 Updated 11 months ago
umuthopeyildirim / DOOM-Mistral
Mistral7B playing DOOM
☆132 Updated last year