Yifan-Song793 / GoodBadGreedy

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
23 · Updated 2 months ago

Related projects: