☆111Jul 2, 2024Updated last year
Alternatives and similar repositories for LMRL-Gym
Users that are interested in LMRL-Gym are comparing it to the libraries listed below
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆202Apr 17, 2025Updated 10 months ago
- ☆67Mar 6, 2025Updated last year
- Plan✕ is a platform for creating and publishing digital planning services☆17Updated this week
- Detect-Then-Explain Framework for Text-to-SQL task☆10Dec 6, 2023Updated 2 years ago
- ☆18May 3, 2025Updated 10 months ago
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"☆11Oct 27, 2025Updated 4 months ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆28Jul 9, 2025Updated 7 months ago
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality☆19Oct 22, 2025Updated 4 months ago
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi…☆737Sep 11, 2025Updated 5 months ago
- A benchmark for evaluating learning agents based on just language feedback☆94Jun 10, 2025Updated 8 months ago
- [EMNLP 2023] Official repository for Dialogue Chain-of-Thought Distillation (DONUT & DOCTOR)☆11Nov 15, 2023Updated 2 years ago
- ☆15Aug 18, 2022Updated 3 years ago
- ☆17Jun 9, 2024Updated last year
- Code for "End-to-End Learning of Flowchart Grounded Task-Oriented Dialogs"☆14Oct 10, 2022Updated 3 years ago
- ☆17Apr 7, 2025Updated 10 months ago
- [NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents☆496Sep 6, 2024Updated last year
- ESM2 protein language models in JAX/Flax☆18Oct 10, 2022Updated 3 years ago
- ☆14Jul 12, 2021Updated 4 years ago
- ☆16Jul 20, 2023Updated 2 years ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆177Sep 18, 2025Updated 5 months ago
- Official repository of DialSim☆29Oct 31, 2025Updated 4 months ago
- ☆17Dec 21, 2023Updated 2 years ago
- The code of paper "Toward Optimal LLM Alignments Using Two-Player Games".☆17Jun 20, 2024Updated last year
- [ACL2024 Findings]DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling☆18Jun 6, 2024Updated last year
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆17Jan 8, 2025Updated last year
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆261May 5, 2025Updated 10 months ago
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators☆47Dec 23, 2025Updated 2 months ago
- Efficient Real-World RL for Legged Locomotion via Adaptive Policy Regularization☆82Nov 1, 2023Updated 2 years ago
- ☆89Aug 21, 2023Updated 2 years ago
- Benchmarking Agentic LLM and VLM Reasoning On Games☆232Feb 10, 2026Updated 3 weeks ago
- [ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"☆482Nov 7, 2025Updated 3 months ago
- A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm…☆43Sep 19, 2022Updated 3 years ago
- Personal implementation of ASIF by Antonio Norelli☆26May 24, 2024Updated last year
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models"☆90Jan 3, 2024Updated 2 years ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆148Nov 26, 2024Updated last year
- ☆18Apr 17, 2019Updated 6 years ago
- ☆26Mar 19, 2025Updated 11 months ago
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving☆24Aug 25, 2025Updated 6 months ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago