Halluminate / WebBench
📚 Benchmark your browser agent on ~2.5k READ and ACTION based tasks
☆66 Updated 2 months ago
Alternatives and similar repositories for WebBench
Users who are interested in WebBench are comparing it to the libraries listed below
- ☆104 Updated 4 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆88 Updated last month
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated 2 weeks ago
- 🤖 Headless IDE for AI agents ☆200 Updated last week
- Build AI Agents with Your Existing Python Code! ☆67 Updated 11 months ago
- A toolkit for building computer use AI agents ☆176 Updated 3 months ago
- Not Diamond Python SDK ☆89 Updated this week
- ☆34 Updated 2 months ago
- Routing on Random Forest (RoRF) ☆213 Updated last year
- Embed anything. ☆27 Updated last year
- Anthropic Computer Use with Modal Sandboxes ☆40 Updated 11 months ago
- Efficient computer use agent powered by Meta Llama 4 Maverick ☆45 Updated 6 months ago
- MarinaBox is a toolkit for creating and managing secure, isolated environments for AI agents ☆137 Updated 7 months ago
- Letting Claude Code develop his own MCP tools :) ☆123 Updated 7 months ago
- ☆47 Updated last year
- ☆84 Updated 11 months ago
- ☆142 Updated 7 months ago
- Make DSPy Agentic using protocol-first approach that support the Agent Protocols like MCP, A2A ☆56 Updated 4 months ago
- ☆113 Updated 3 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆58 Updated 5 months ago
- Managed Agent Posttraining ☆48 Updated this week
- Chat strategies for LLMs ☆99 Updated last year
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆69 Updated last year
- ☆89 Updated 9 months ago
- A framework for optimizing DSPy programs with RL ☆202 Updated this week
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆89 Updated 2 weeks ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 Updated last year
- The Showdown Computer Control Evaluation Suite ☆87 Updated 6 months ago
- A Python library that uses Reinforcement Learning (RL) to train LLMs. ☆42 Updated 2 months ago
- An open-source debugging agent in VSCode ☆79 Updated last year