keyonvafa / world-model-evaluation
☆55Updated 7 months ago
Alternatives and similar repositories for world-model-evaluation
Users who are interested in world-model-evaluation are comparing it to the libraries listed below.
- A programming language for formal/informal computation.☆41Updated 2 months ago
- ☆38Updated 11 months ago
- Code for minimum-entropy coupling.☆32Updated last year
- gzip Predicts Data-dependent Scaling Laws☆35Updated last year
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL☆52Updated 8 months ago
- Probabilistic programming with large language models☆121Updated 2 weeks ago
- Generative cellular automaton-like learning environments for RL.☆19Updated 4 months ago
- lossily compress representation vectors using product quantization☆57Updated 2 months ago
- Sparse and discrete interpretability tool for neural networks☆63Updated last year
- ☆97Updated 4 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆43Updated 6 months ago
- Portfolio REgret for Confidence SEquences☆20Updated 6 months ago
- ☆60Updated 3 years ago
- ☆26Updated 2 years ago
- Evaluation of neuro-symbolic engines☆35Updated 10 months ago
- Learning Universal Predictors☆76Updated 10 months ago
- ☆28Updated last year
- Experiments for efforts to train a new and improved t5☆77Updated last year
- ☆85Updated 5 months ago
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).☆59Updated 5 months ago
- ☆53Updated last year
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…☆45Updated 2 months ago
- ☆52Updated last year
- Collection of LLM completions for reasoning-gym task datasets☆24Updated last month
- A domain-specific probabilistic programming language for modeling and inference with language models☆131Updated last month
- ☆134Updated 2 months ago
- ☆37Updated 9 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated last year
- Language of thought library for Python 3☆49Updated last year
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆32Updated 2 months ago