wesg52 / world-models
Extracting spatial and temporal world models from LLMs
☆255Updated last year
Alternatives and similar repositories for world-models
Users who are interested in world-models are comparing it to the repositories listed below.
- ☆288Updated 10 months ago
- ☆265Updated last year
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning☆156Updated this week
- Representation Engineering: A Top-Down Approach to AI Transparency☆828Updated 9 months ago
- Function Vectors in Large Language Models (ICLR 2024)☆166Updated last month
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)☆214Updated this week
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"☆484Updated 4 months ago
- Emergent world representations: Exploring a sequence model trained on a synthetic task☆181Updated last year
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆191Updated 5 months ago
- ☆114Updated 9 months ago
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).☆201Updated 5 months ago
- Tools for understanding how transformer predictions are built layer-by-layer☆490Updated 11 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …☆172Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"☆307Updated 5 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆146Updated 3 months ago
- An extensible benchmark for evaluating large language models on planning☆361Updated 3 weeks ago
- Sparsify transformers with SAEs and transcoders☆526Updated this week
- This repository collects all relevant resources about interpretability in LLMs☆343Updated 6 months ago
- RewardBench: the first evaluation tool for reward models.☆566Updated last week
- ICML 2024: Improving Factuality and Reasoning in Language Models through Multiagent Debate☆432Updated 3 weeks ago
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆206Updated last year
- ☆142Updated last year
- Using sparse coding to find distributed representations used by neural networks.☆242Updated last year
- Meta-Learning for Compositionality (MLC) for modeling human behavior☆141Updated last year
- This repository contains a collection of papers and resources on Reasoning in Large Language Models.☆564Updated last year
- Scaling Data-Constrained Language Models☆334Updated 7 months ago
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks☆308Updated 6 months ago
- ☆267Updated 3 months ago
- Evaluating LLMs with fewer examples☆153Updated last year
- ☆257Updated last year