aogara-ds / hoodwinked
Text-based game of lies and deceit, made for language models.
☆32 Updated 2 years ago
Alternatives and similar repositories for hoodwinked
Users who are interested in hoodwinked are comparing it to the repositories listed below
- ☆86 Updated last year
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆88 Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆28 Updated 2 years ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 9 months ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆23 Updated last year
- Memoria is a human-inspired memory architecture for neural networks. ☆76 Updated last year
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆50 Updated last year
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆127 Updated last year
- ☆134 Updated last year
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆116 Updated 2 years ago
- The Next Generation Multi-Modality Superintelligence ☆69 Updated last year
- Measuring the situational awareness of language models ☆38 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆89 Updated last year
- ☆35 Updated 2 years ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆151 Updated 8 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated 2 years ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆78 Updated 10 months ago
- ☆63 Updated last year
- Plug in and Play implementation of "Certified Reasoning with Language Models" that elevates model reasoning by 40% ☆15 Updated 2 years ago
- [NeurIPS 2024] GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ☆66 Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆63 Updated 10 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆64 Updated last year
- ☆78 Updated 2 years ago
- Generative Agents: Interactive Simulacra of Human Behavior ☆102 Updated 2 years ago
- Generate High Quality textual or multi-modal datasets with Agents ☆17 Updated 2 years ago
- ☆41 Updated last year
- ☆31 Updated last year
- ☆55 Updated 11 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆56 Updated 5 months ago
- ☆15 Updated last year