XiaojuanTang / Mars
A benchmark for evaluating situated inductive reasoning
☆16Updated 7 months ago
Alternatives and similar repositories for Mars
Users who are interested in Mars are comparing it to the repositories listed below
- ☆110Updated 4 months ago
- Natural Language Reinforcement Learning☆95Updated last month
- ☆99Updated last year
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆140Updated this week
- ☆115Updated 7 months ago
- ☆61Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆142Updated 9 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)☆75Updated last year
- ☆46Updated 6 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆63Updated 6 months ago
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. …☆140Updated last year
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆81Updated 2 months ago
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR 2024 Spotlight)☆66Updated last year
- ☆94Updated last year
- Reinforcing General Reasoning without Verifiers☆80Updated 2 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…☆108Updated last year
- ☆44Updated last year
- ☆83Updated 2 months ago
- A repo for open research on building large reasoning models☆92Updated last week
- The official implementation of Self-Exploring Language Models (SELM)☆64Updated last year
- ☆25Updated 2 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆104Updated last month
- "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆79Updated 4 months ago
- Benchmarking Agentic LLM and VLM Reasoning On Games☆188Updated 2 weeks ago
- ☆213Updated 6 months ago
- This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration."☆33Updated last month
- Code for the paper: "Learning to Reason without External Rewards"☆349Updated last month
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆189Updated 4 months ago
- Verlog: A Multi-turn RL framework for LLM agents☆35Updated 2 weeks ago
- ☆21Updated last month