kohjingyu / search-agents
Code for the paper 🌳 Tree Search for Language Model Agents
★167 · Updated 6 months ago
Alternatives and similar repositories for search-agents:
Users who are interested in search-agents compare it to the repositories listed below.
- Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ★136 · Updated last month
- AWM: Agent Workflow Memory ★233 · Updated 2 months ago
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ★185 · Updated 2 weeks ago
- ★120 · Updated 7 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym ★230 · Updated 2 weeks ago
- VisualWebArena is a benchmark for multimodal agents. ★283 · Updated 2 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ★208 · Updated this week
- An Analytical Evaluation Board of Multi-turn LLM Agents ★272 · Updated 8 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ★158 · Updated 2 weeks ago
- UGround: Universal GUI Visual Grounding for GUI Agents ★147 · Updated this week
- ★38 · Updated 6 months ago
- ★110 · Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★111 · Updated 2 months ago
- [NeurIPS 2022] WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents ★302 · Updated 4 months ago
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ★298 · Updated 2 months ago
- A benchmark list for evaluation of large language models. ★79 · Updated 6 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ★203 · Updated 8 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ★125 · Updated 5 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★134 · Updated last month
- Benchmarking LLMs with Challenging Tasks from Real Users ★208 · Updated 2 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ★58 · Updated last week
- Towards Large Multimodal Models as Visual Foundation Agents ★167 · Updated last month
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ★176 · Updated 6 months ago
- ★48 · Updated last month
- Building a comprehensive and handy list of papers for GUI agents ★193 · Updated last week
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ★127 · Updated 2 months ago
- ★142 · Updated last week
- ★98 · Updated this week
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ★176 · Updated last month
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ★257 · Updated 2 weeks ago