nottelabs / open-operator-evals
Open-source benchmark evaluating the performance of web operators/agents
☆44 Updated 6 months ago
Alternatives and similar repositories for open-operator-evals
Users who are interested in open-operator-evals are comparing it to the libraries listed below.
- Vibe-coding tools for the LlamaIndex ecosystem ☆161 Updated last week
- Turn topics, links, and files into AI-generated research notebooks: summarize, explore, and ask anything. ☆141 Updated 4 months ago
- Context Engineering Course with DSPy ☆194 Updated 2 months ago
- A toolkit for building computer use AI agents ☆176 Updated 3 months ago
- AI-driven web automation agent that uses Playwright for browser interactions and LLM integration for intelligent decision-making. It's de… ☆152 Updated 4 months ago
- ☆55 Updated last month
- ☆89 Updated 5 months ago
- An assistant for Slack built with Arcade and Langgraph. Interact with Google Calendar, Mail, Github, Search Engines, Firecrawl and more a… ☆109 Updated 4 months ago
- A MCP server connecting to managed indexes on LlamaCloud ☆81 Updated 3 months ago
- A list of AI memory projects ☆234 Updated 9 months ago
- The Showdown Computer Control Evaluation Suite ☆87 Updated 6 months ago
- An OpenSource Deep Research library with reasoning ☆161 Updated last month
- Run Surfer-H agents powered by Holo1 using the Surfer-H-CLI. Includes example tasks, scripts, and configurations. ☆136 Updated 3 weeks ago
- ☆73 Updated last year
- A Generative UI app for interacting with Computer Use Agents ☆207 Updated 6 months ago
- ☆112 Updated 2 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆135 Updated last month
- Model Context Protocol (MCP) Server for Langfuse Prompt Management. This server allows you to access and manage your Langfuse prompts thr… ☆140 Updated 8 months ago
- An Automatic Prompt Optimization Framework for Large Language Models ☆126 Updated 2 months ago
- ☆49 Updated 10 months ago
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated 3 weeks ago
- Terminal-based AI Coding Agent, similar to Claude Code, OpenAI Codex etc. but works with many more LLMs e.g. Gemini, Groq, Deepseek ☆148 Updated 5 months ago
- ☆181 Updated 8 months ago
- ☆23 Updated 11 months ago
- An AI Clone For Any X Profile ☆88 Updated 10 months ago
- MCP server(s) for Aipolabs ACI.dev ☆246 Updated 2 months ago
- ☆88 Updated 8 months ago
- Open-source resources on agents for computer use. ☆377 Updated last week
- AI agent with RAG+ReAct on Indian Constitution & BNS ☆75 Updated 3 months ago
- Python examples using gotoHuman and LangGraph ☆45 Updated 7 months ago