thu-nics / MARSHAL
MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
☆34 Updated last week
Alternatives and similar repositories for MARSHAL
Users who are interested in MARSHAL are comparing it to the repositories listed below.
- The original Shared Recurrent Memory Transformer implementation ☆33 Updated 6 months ago
- ☆45 Updated 7 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Updated 3 months ago
- ☆29 Updated last month
- From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆36 Updated last month
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning ☆57 Updated 3 months ago
- ☆75 Updated 2 months ago
- ☆67 Updated 10 months ago
- ☆27 Updated last year
- ☆29 Updated 10 months ago
- Verlog: A Multi-turn RL framework for LLM agents ☆67 Updated 2 weeks ago
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆36 Updated 3 months ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 Updated 7 months ago
- When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought ☆26 Updated 2 months ago
- LLM as World Models using Bayesian inference ☆16 Updated 8 months ago
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models ☆19 Updated 7 months ago
- ☆64 Updated 3 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆174 Updated 4 months ago
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆89 Updated 7 months ago
- ☆21 Updated 4 months ago
- ☆118 Updated 9 months ago
- ☆50 Updated 11 months ago
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner. ☆36 Updated 2 months ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆49 Updated 3 weeks ago
- Natural Language Reinforcement Learning ☆101 Updated 6 months ago
- Python library for solving reinforcement learning (RL) problems using generative models. ☆11 Updated 11 months ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆36 Updated 3 months ago
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆30 Updated last year
- THOUGHTSCULPT, a general reasoning and search method for complex tasks ☆13 Updated last year
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ☆23 Updated last month