Junjie-Ye / RoTBench

RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
12 · Updated 5 months ago

Related projects: