wesg52 / world-models
Extracting spatial and temporal world models from LLMs
☆255 · Updated last year
Alternatives and similar repositories for world-models
Users who are interested in world-models compare it to the repositories listed below
- Tools for understanding how transformer predictions are built layer-by-layer ☆500 · Updated last year
- Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467 ☆289 · Updated 4 months ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆37 · Updated 7 months ago
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆181 · Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆202 · Updated 6 months ago
- ☆120 · Updated 10 months ago
- ☆132 · Updated 7 months ago
- ☆288 · Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆170 · Updated 2 months ago
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback ☆207 · Updated 2 years ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆317 · Updated 7 months ago
- ☆270 · Updated last year
- Scaling Data-Constrained Language Models ☆335 · Updated 9 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆184 · Updated this week
- ☆99 · Updated 4 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆219 · Updated 6 months ago
- ☆207 · Updated last year
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆500 · Updated last year
- ☆294 · Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆359 · Updated 7 months ago
- ☆95 · Updated 4 months ago
- ☆69 · Updated last year
- ☆67 · Updated 2 years ago
- Inspecting and Editing Knowledge Representations in Language Models ☆116 · Updated last year
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆224 · Updated this week
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆190 · Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆127 · Updated 2 years ago
- ☆495 · Updated 11 months ago
- ☆180 · Updated last year
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆60 · Updated 2 months ago