amazon-agi / tau2-bench-verified
τ²-Bench-Verified is a corrected and verified version of the original τ²-bench benchmark. This release addresses issues discovered in the original dataset where task definitions, expected actions, and evaluation criteria did not properly align with the stated policies or database contents.
32 | Dec 15, 2025 | Updated 2 months ago

Alternatives and similar repositories for tau2-bench-verified

Users who are interested in tau2-bench-verified are comparing it to the libraries listed below.
