scoootscooob / clawbenchView on GitHub
The agent benchmark that scores the full stack — harness, config, and model — not just the LLM. Trace-based scoring, reliability metrics, configuration diagnostics.
54Apr 14, 2026Updated this week

Alternatives and similar repositories for clawbench

Users that are interested in clawbench are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Are these results useful?