keyonvafa / world-model-evaluation
☆55Updated 6 months ago
Alternatives and similar repositories for world-model-evaluation
Users who are interested in world-model-evaluation are comparing it to the libraries listed below.
- Collection of LLM completions for reasoning-gym task datasets☆22Updated last week
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL☆51Updated 7 months ago
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22☆65Updated 2 years ago
- ☆19Updated last year
- Sparse and discrete interpretability tool for neural networks☆63Updated last year
- Code for minimum-entropy coupling.☆32Updated 11 months ago
- ☆38Updated 10 months ago
- ☆131Updated 2 months ago
- Understanding how features learned by neural networks evolve throughout training☆34Updated 7 months ago
- Evaluation of neuro-symbolic engines☆35Updated 10 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆84Updated 2 months ago
- Implementation of the RASP transformer programming language (https://arxiv.org/pdf/2106.06981.pdf)☆53Updated 3 years ago
- Experiments for efforts to train a new and improved T5☆76Updated last year
- A programming language for formal/informal computation.☆41Updated last month
- OMNI: Open-endedness via Models of human Notions of Interestingness☆46Updated 4 months ago
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"☆22Updated last year
- ☆96Updated 3 months ago
- Training code for Sparse Autoencoders on Embedding models☆38Updated 3 months ago
- Building the cognitive-core to solve ARC-AGI-2☆21Updated 4 months ago
- Lossily compress representation vectors using product quantization☆54Updated last month
- ARC gym: a data generation framework for the Abstraction & Reasoning Corpus☆22Updated 2 weeks ago
- Official repo for Learning to Reason for Long-Form Story Generation☆60Updated last month
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆43Updated 6 months ago
- ☆68Updated 9 months ago
- gzip Predicts Data-dependent Scaling Laws☆35Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…☆59Updated 7 months ago
- ☆26Updated 2 years ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.☆29Updated last month
- Code associated with papers on superposition (in ML interpretability)☆28Updated 2 years ago
- Learn online intrinsic rewards from LLM feedback☆37Updated 5 months ago