sail-sg / Cheating-LLM-Benchmarks

[SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
59 · Updated 2 weeks ago

Related projects

Alternatives and complementary repositories for Cheating-LLM-Benchmarks