research-outcome / LLM-Game-Benchmark
Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard
☆21 · Updated last year
Alternatives and similar repositories for LLM-Game-Benchmark
Users interested in LLM-Game-Benchmark are comparing it to the repositories listed below:
- Code for the paper "Learning to Reason without External Rewards" ☆383 · Updated 5 months ago
- An RL framework for multi-LLM agent systems ☆80 · Updated last week
- A framework for LLM-based multi-agent reinforced training and inference ☆373 · Updated last month
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆159 · Updated last month
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆254 · Updated 7 months ago
- The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning" ☆116 · Updated last week
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆171 · Updated 3 months ago
- ☆319 · Updated 6 months ago
- A Gym for Agentic LLMs ☆404 · Updated last month
- Research code for the preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning" ☆115 · Updated 4 months ago
- Official repo for the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆270 · Updated 2 months ago
- Reproducing R1 for Code with Reliable Rewards ☆278 · Updated 7 months ago