Farama-Foundation / miniwob-plusplus

MiniWoB++: a web interaction benchmark for reinforcement learning

☆331

Alternatives and similar repositories for miniwob-plusplus

Users who are interested in miniwob-plusplus are comparing it to the libraries listed below.

allenai / ScienceWorld
ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
☆280 Updated 3 weeks ago
princeton-nlp / WebShop
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
☆379 Updated 11 months ago
SwiftSage / SwiftSage
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
☆311 Updated 9 months ago
flowersteam / lamorel
Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
☆236 Updated 9 months ago
microsoft / SmartPlay
SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. …
☆140 Updated last year
flowersteam / Grounding_LLMs_with_online_RL
We perform functional grounding of LLMs' knowledge in BabyAI-Text
☆268 Updated 11 months ago
alfworld / alfworld
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
☆501 Updated 2 weeks ago
web-arena-x / visualwebarena
VisualWebArena is a benchmark for multimodal agents.
☆366 Updated 9 months ago
posgnu / rci-agent
A codebase for "Language Models can Solve Computer Tasks"
☆234 Updated last year
karthikv792 / LLMs-Planning
An extensible benchmark for evaluating large language models on planning
☆393 Updated last month
CraftJarvis / MC-Planner
Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen…
☆282 Updated 2 years ago
haotiansun14 / AdaPlanner
AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback
☆112 Updated 4 months ago
ServiceNow / WorkArena
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
☆200 Updated last week
Cranial-XIX / llm-pddl
☆419 Updated last year
stanfordnlp / wge
Workflow-Guided Exploration: sample-efficient RL agent for web tasks
☆114 Updated 2 years ago
ShengranHu / Thought-Cloning
[NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
☆268 Updated last year
DigiRL-agent / digirl
Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.
☆372 Updated 5 months ago
cooelf / Auto-GUI
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
☆244 Updated last year
princeton-nlp / intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
☆223 Updated last year
StonyBrookNLP / appworld
🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…
☆232 Updated 2 months ago
jlin816 / dynalang
Code for "Learning to Model the World with Language." ICML 2024 Oral.
☆387 Updated last year
abdulhaim / LMRL-Gym
☆99 Updated last year
minaek / reward_design_with_llms
☆220 Updated 2 years ago
Ber666 / RAP
Reasoning with Language Model is Planning with World Model
☆168 Updated last year
BladeTransformerLLC / OvercookedGPT
An OpenAI gym environment to evaluate the ability of LLMs (e.g. GPT-4, Claude) in long-horizon reasoning and task planning in dynamic mult…
☆69 Updated 2 years ago
OSU-NLP-Group / Mind2Web
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w…
☆853 Updated 4 months ago
DeckardAgent / deckard
Official implementation of the DECKARD Agent from the paper "Do Embodied Agents Dream of Pixelated Sheep?"
☆94 Updated 2 years ago
agentification / RAFA_code
☆143 Updated last year
hkust-nlp / AgentBoard
An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]
☆335 Updated last year
web-arena-x / webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
☆1,084 Updated 6 months ago