upiterbarg / diff_history

[ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus)

☆20

Alternatives and similar repositories for diff_history

Users who are interested in diff_history are comparing it to the repositories listed below.

conglu1997 / intelligent-go-explore
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
☆65 Updated 8 months ago
abdulhaim / LMRL-Gym
☆105 Updated last year
upiterbarg / lintseq
[ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus)
☆19 Updated 9 months ago
facebookresearch / motif
Intrinsic Motivation from Artificial Intelligence Feedback
☆132 Updated 2 years ago
balrog-ai / BALROG
Benchmarking Agentic LLM and VLM Reasoning On Games
☆207 Updated 3 months ago
jinpz / q_sharp
The official code release for Q#: Provably Optimal Distributional RL for LLM Post-Training
☆17 Updated 8 months ago
facebookresearch / minimax
Efficient baselines for autocurricula in JAX.
☆201 Updated last year
facebookresearch / oni
Learn online intrinsic rewards from LLM feedback
☆45 Updated 11 months ago
mklissa / maestromotif
Skill Design From AI Feedback
☆32 Updated 8 months ago
McGill-NLP / VinePPO
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
☆179 Updated 5 months ago
vwxyzjn / cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
☆117 Updated last year
WentseChen / Verlog
Verlog: A Multi-turn RL framework for LLM agents
☆64 Updated 2 weeks ago
btnorman / First-Explore
Repo to reproduce the First-Explore paper results
☆38 Updated 10 months ago
DHDev0 / Muzero-unplugged
PyTorch implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ…
☆34 Updated 4 months ago
BladeTransformerLLC / OvercookedGPT
An OpenAI gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic mult…
☆71 Updated 2 years ago
FLAIROx / cultural-accumulation
☆15 Updated last year
vmicheli / delta-iris
Efficient World Models with Context-Aware Tokenization. ICML 2024
☆113 Updated last year
microsoft / SmartPlay
SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. …
☆143 Updated last year
maxencefaldor / omni-epic
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).
☆69 Updated 10 months ago
Reytuag / transformerXL_PPO_JAX
☆87 Updated last year
jennyzzt / omni
OMNI: Open-endedness via Models of human Notions of Interestingness
☆57 Updated 9 months ago
agentification / RAFA_code
☆144 Updated last year
flowersteam / lamorel
Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
☆242 Updated 3 weeks ago
dunnolab / xland-minigrid-datasets
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025)
☆81 Updated 9 months ago
luchris429 / discovered-policy-optimisation
Code for Discovered Policy Optimisation (NeurIPS 2022)
☆12 Updated 2 years ago
Cornell-RL / tril
☆128 Updated last year
kanishkg / stream-of-search
Repository for the paper Stream of Search: Learning to Search in Language
☆151 Updated 9 months ago
andyljones / boardlaw
Scaling scaling laws with board games.
☆53 Updated 2 years ago
rail-berkeley / SUPE
This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration."
☆35 Updated 4 months ago
BatsResearch / planetarium
Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL
☆61 Updated last year