mrconter1 / BenchmarkAggregator

Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
16 · Updated last year

Alternatives and similar repositories for BenchmarkAggregator

Users who are interested in BenchmarkAggregator are comparing it to the libraries listed below.
