google-deepmind / pix2act
☆49Updated 8 months ago
Related projects:
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Pap…☆81Updated last month
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control☆48Updated 3 weeks ago
- ☆34Updated last month
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI☆55Updated last week
- ☆74Updated 9 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents☆81Updated last week
- Self-Alignment with Principle-Following Reward Models☆144Updated 6 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆30Updated last month
- Augmented LLM with self-reflection☆80Updated 9 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆70Updated last month
- A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆106Updated 5 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆134Updated 6 months ago
- Code and Data for Tau-Bench☆91Updated this week
- Can Language Models Solve Olympiad Programming?☆92Updated last month
- WebLINX is a benchmark for building web navigation agents with conversational capabilities☆111Updated 2 months ago
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models"☆69Updated 8 months ago
- ☆73Updated last year
- DialOp: Decision-oriented dialogue environments for collaborative language agents☆97Updated 2 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆84Updated 11 months ago
- GUICourse: From General Vision Language Models to Versatile GUI Agents☆68Updated 2 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆124Updated last month
- CodeUltraFeedback: aligning large language models to coding preferences☆62Updated 2 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback☆82Updated last year
- ☆101Updated 2 months ago
- VisualWebArena is a benchmark for multimodal agents.☆211Updated last month
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆45Updated 6 months ago
- ☆40Updated 4 months ago
- ☆87Updated 2 months ago
- ☆131Updated 4 months ago
- Towards Large Multimodal Models as Visual Foundation Agents☆87Updated 3 weeks ago