adamkarvonen / chess_gpt_eval
A repo to evaluate various LLMs' chess-playing abilities.
☆64Updated 6 months ago
Related projects
Alternatives and complementary repositories for chess_gpt_eval
- An implementation of Self-Extend, to expand the context window via grouped attention☆118Updated 10 months ago
- Track the progress of LLM context utilisation☆53Updated 3 months ago
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …☆192Updated 5 months ago
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full finetunes.☆81Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language☆84Updated 3 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…☆151Updated this week
- ☆74Updated last week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…☆201Updated 5 months ago
- ☆76Updated 10 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆38Updated 2 weeks ago
- ☆99Updated 3 months ago
- ☆93Updated last year
- A codebase for "Language Models can Solve Computer Tasks"☆224Updated 6 months ago
- This repository explains and provides examples for "concept anchoring" in GPT4.☆72Updated 10 months ago
- Full finetuning of large language models without large memory requirements☆93Updated 10 months ago
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement☆44Updated last week
- An automated tool for discovering insights from research paper corpora☆135Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs☆113Updated 3 months ago
- Memoria is a human-inspired memory architecture for neural networks.☆57Updated 3 weeks ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments☆62Updated last month
- ☆48Updated last year
- Evaluating LLMs with CommonGen-Lite☆84Updated 7 months ago
- ☆72Updated last year
- Aidan Bench attempts to measure <big_model_smell> in LLMs.☆89Updated 3 weeks ago
- Draw more samples☆174Updated 4 months ago
- ☆101Updated last month
- A repository for training nanoGPT-based chess-playing language models.☆22Updated 6 months ago
- [NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking☆252Updated 4 months ago
- ☆135Updated 6 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆119Updated 2 weeks ago