keskival / recursive-self-improvement-suite
A suite of open-ended, non-imitative tasks involving generalizable skills for large language model chatbots and agents to enable bootstrapped recursive self-improvement and an unambiguous AGI.
☆28Updated 3 months ago
Related projects
Alternatives and complementary repositories for recursive-self-improvement-suite
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation☆29Updated 10 months ago
- Evaluation of neuro-symbolic engines☆33Updated 3 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆115Updated 8 months ago
- Fun project to run your own LLM chat bot using llama.cpp☆11Updated last year
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"☆55Updated 4 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents☆111Updated 5 months ago
- Code for "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆11Updated last month
- Functional Benchmarks and the Reasoning Gap☆78Updated last month
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆162Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language☆93Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆128Updated last month
- ☆78Updated 11 months ago
- Based on the tree of thoughts paper☆45Updated last year
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆39Updated last month
- ☆43Updated 2 months ago
- ☆112Updated last month
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".☆84Updated 8 months ago
- Sparse and discrete interpretability tool for neural networks☆55Updated 9 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆56Updated 2 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs.☆98Updated 5 months ago
- ☆37Updated this week
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".☆155Updated 6 months ago
- ☆74Updated 3 weeks ago
- A benchmark list for evaluation of large language models.☆68Updated 4 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions."☆63Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆87Updated last year
- ☆83Updated last year
- ☆250Updated 5 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆84Updated 8 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.☆44Updated 5 months ago