allenai / WildBench
Benchmarking LLMs with Challenging Tasks from Real Users
246 · Nov 3, 2024 · Updated last year

Alternatives and similar repositories for WildBench

Users who are interested in WildBench are comparing it to the libraries listed below.
