google-deepmind / game_arena
☆85 Updated last week
Alternatives and similar repositories for game_arena
Users who are interested in game_arena are comparing it to the libraries listed below
- Evaluation of LLMs on latest math competitions ☆216 Updated this week
- ☆93 Updated last month
- A Gym for Agentic LLMs ☆444 Updated 3 weeks ago
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆177 Updated last month
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆216 Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆189 Updated 11 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆175 Updated last year
- ☆123 Updated 11 months ago
- Open-source release accompanying Gao et al. 2025 ☆501 Updated 2 months ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆203 Updated 3 weeks ago
- ☆148 Updated last week
- ☆41 Updated 10 months ago
- ☆123 Updated last week
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆427 Updated 2 weeks ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆358 Updated 7 months ago
- Replicating O1 inference-time scaling laws ☆93 Updated last year
- Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality ☆317 Updated last month
- Simple & Scalable Pretraining for Neural Architecture Research ☆308 Updated 2 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆350 Updated last week
- ☆388 Updated 3 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆69 Updated last year
- ☆102 Updated last month
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆191 Updated 7 months ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆273 Updated 3 months ago
- ☆90 Updated 3 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆153 Updated last year
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆283 Updated 4 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆115 Updated 9 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆292 Updated 2 months ago
- rl from zero pretrain, can it be done? yes. ☆286 Updated 4 months ago