Halluminate / WebBench
📚 Benchmark your browser agent on ~2.5k READ and ACTION based tasks
☆85 Updated 6 months ago
Alternatives and similar repositories for WebBench
Users who are interested in WebBench are comparing it to the libraries listed below
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆90 Updated last month
- Routing on Random Forest (RoRF) ☆239 Updated last year
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆46 Updated 3 weeks ago
- ☆87 Updated last year
- ☆94 Updated last year
- 🤖 Headless IDE for AI agents ☆200 Updated 3 months ago
- ☆140 Updated 11 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆61 Updated 9 months ago
- Deprecated Browserbase Python SDK ☆10 Updated last year
- Agent computer interface for AI software engineer. ☆116 Updated 2 months ago
- Run Surfer-H agents powered by Holo1 using the Surfer-H-CLI. Includes example tasks, scripts, and configurations. ☆146 Updated last month
- The theory of mind module for the SWE agent ☆73 Updated 3 weeks ago
- The Showdown Computer Control Evaluation Suite ☆93 Updated 10 months ago
- Anthropic Computer Use with Modal Sandboxes ☆43 Updated last year
- An open-source debugging agent in VSCode ☆88 Updated last year
- Testing and evaluation framework for voice agents ☆162 Updated 8 months ago
- A toolkit for building computer use AI agents ☆182 Updated 7 months ago
- Sphynx Hallucination Induction ☆53 Updated last year
- ☆126 Updated 4 months ago
- ☆107 Updated 3 months ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆424 Updated last week
- Open Agent Computer Interface ☆92 Updated last year
- Scrapybara Python SDK ☆73 Updated 5 months ago
- A Python library that uses Reinforcement Learning (RL) to train LLMs. ☆42 Updated 6 months ago
- ☆85 Updated 5 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆99 Updated 4 months ago
- [NAACL2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications ☆143 Updated 6 months ago
- Grapheteria: A structured framework bringing uniformity to agent orchestration! ☆60 Updated 7 months ago
- Challenges for general-purpose web-browsing AI agents ☆67 Updated 8 months ago
- 🔌 Want one client library for all your embeddings? 💙 Choose Catsu! 🐱 ☆57 Updated 3 weeks ago