jayminban / 41-llms-evaluated-on-19-benchmarks
This project benchmarks 41 open-source large language models across 19 evaluation tasks using the lm-evaluation-harness library.
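Evaluations with lm-evaluation-harness are typically run through its `lm_eval` command-line tool. A minimal sketch of such an invocation is shown below; the model name and task list are illustrative assumptions, not taken from this repository:

```shell
# Install the harness, then evaluate one model on a couple of tasks.
# The model and tasks below are placeholders, not the repo's actual configuration.
pip install lm-eval
lm_eval --model hf \
    --model_args pretrained=meta-llama/Llama-3.1-8B-Instruct \
    --tasks mmlu,hellaswag \
    --batch_size 8 \
    --output_path results/
```

Running this for each of the 41 models across the 19 benchmark tasks would reproduce the kind of results grid the project reports.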
99 · Sep 5, 2025 · Updated 6 months ago
