mrconter1 / BenchmarkAggregator

Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
16 · Updated 9 months ago

Alternatives and similar repositories for BenchmarkAggregator

Users who are interested in BenchmarkAggregator are comparing it to the libraries listed below.
