balrog-ai/BALROG

Benchmarking Agentic LLM and VLM Reasoning On Games

☆261

Alternatives and similar repositories for BALROG

Users who are interested in BALROG are comparing it to the repositories listed below.

facebookresearch / sol
Scalable Option Learning
☆23 · Updated this week
NetHack-LE / nle
The NetHack Learning Environment
☆135 · Jul 17, 2026 · Updated last week
alexzhang13 / videogamebench
Benchmark environment for evaluating vision-language models (VLMs) on popular video games!
☆364 · May 30, 2025 · Updated last year
upiterbarg / diff_history
[ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus)
☆20 · Aug 20, 2024 · Updated last year
TextArena / TextArena
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
☆411 · Updated this week
MichaelTMatthews / Craftax
(Crafter + NetHack) in JAX. ICML 2024 Spotlight.
☆426 · Jun 20, 2026 · Updated last month
FLAIROx / cultural-accumulation
☆16 · Jul 16, 2024 · Updated 2 years ago
conglu1997 / intelligent-go-explore
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
☆69 · Apr 16, 2026 · Updated 3 months ago
ngoodger / nle-language-wrapper
Nethack Learning Environment Wrapper for Language Interface
☆43 · Sep 11, 2023 · Updated 2 years ago
yunfeixie233 / ViGaL
☆70 · Feb 4, 2026 · Updated 5 months ago
maciej-sypetkowski / autoascend
The first place solution for the NeurIPS 2021 Nethack Challenge -- https://www.aicrowd.com/challenges/neurips-2021-the-nethack-challenge
☆64 · Jan 3, 2023 · Updated 3 years ago
ucl-dark / skillhack
SkillHack: A Benchmark for Skill Transfer in Open-Ended Reinforcement Learning
☆17 · Oct 23, 2022 · Updated 3 years ago
upiterbarg / lintseq
[ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus)
☆19 · Feb 11, 2025 · Updated last year
google-deepmind / nao_top10
☆19 · Mar 1, 2023 · Updated 3 years ago
FLAIROx / JaxGL
Simple JAX Graphics Library.
☆38 · Nov 3, 2024 · Updated last year
Miffyli / nle-sample-factory-baseline
☆22 · Mar 28, 2025 · Updated last year
flowersteam / vivarium
Multi-agent simulator in Jax for research and teaching in AI & ALife
☆31 · Apr 11, 2026 · Updated 3 months ago
upiterbarg / hihack
[NeurIPS 2023] Official code release accompanying the paper "NetHack is Hard to Hack" (Piterbarg, Pinto, Fergus)
☆13 · Oct 30, 2023 · Updated 2 years ago
lmgame-org / GamingAgent
[ICLR 2026] LLM/VLM gaming agents and model evaluation through games.
☆953 · Nov 16, 2025 · Updated 8 months ago
smearle / script-doctor
Code for PuzzleJAX, a benchmark for reasoning and learning, that reimplements PuzzleScript, a concise and expressive DSL and game engine …
☆29 · Jun 30, 2026 · Updated 3 weeks ago
ucl-dark / pax
Scalable Opponent Shaping Experiments in JAX
☆27 · Apr 13, 2024 · Updated 2 years ago
WentseChen / Verlog
Verlog: A Multi-turn RL framework for LLM agents
☆73 · Apr 28, 2026 · Updated 2 months ago
pdfosborne / elsciRL
The core repository of the elsciRL framework.
☆18 · Dec 8, 2025 · Updated 7 months ago
multimodal-art-projection / KORGym
☆60 · May 21, 2025 · Updated last year
jrosseruk / AgentBreeder
[NeurIPS 2025 spotlight] Mitigating the AI Safety Impact of Multi-Agent Scaffolds
☆19 · Sep 22, 2025 · Updated 10 months ago
CommanderCero / NetPlay
A LLM-powered agent for NetHack
☆23 · Nov 4, 2024 · Updated last year
open-thought / reasoning-gym
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
☆1,464 · Apr 17, 2026 · Updated 3 months ago
nacloos / baba-is-ai
Code for "Baba Is AI: Break the Rules to Beat the Benchmark"
☆49 · Sep 3, 2025 · Updated 10 months ago
allenai / ScienceWorld
ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
☆368 · Dec 3, 2025 · Updated 7 months ago
krafton-ai / Orak
☆153 · Jun 17, 2025 · Updated last year
MichalBortkiewicz / JaxGCRL
Online Goal-Conditioned Reinforcement Learning in JAX. ICLR 2025 Spotlight.
☆273 · Jun 6, 2026 · Updated last month
MichaelTMatthews / purejaxgcrl
GCRL in JAX. Official repository for LEO (ICML 2026).
☆28 · Jun 20, 2026 · Updated last month
conglu1997 / v-d4rl
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
☆115 · Apr 16, 2026 · Updated 3 months ago
yuanjiayiy / InvestESG
☆15 · Aug 1, 2025 · Updated 11 months ago
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆103 · May 20, 2025 · Updated last year
jennyzzt / awesome-open-ended
Awesome Open-ended AI
☆458 · Jun 19, 2026 · Updated last month
DramaCow / jaxued
☆98 · Jan 21, 2026 · Updated 6 months ago
mklissa / maestromotif
Skill Design From AI Feedback
☆33 · Feb 27, 2025 · Updated last year
aronvallinder / llm-donor-game
☆12 · Sep 24, 2025 · Updated 10 months ago