A framework for few-shot evaluation of autoregressive language models.
☆12 Jul 14, 2025 Updated 7 months ago
Alternatives and similar repositories for lm-evaluation-harness
Users who are interested in lm-evaluation-harness compare it to the libraries listed below.
- GPI-Space: Memory Driven Computing and Big Data ☆10 Jan 2, 2025 Updated last year
- Self-evaluating RAG application on LangCheck docs ☆11 Sep 10, 2025 Updated 5 months ago
- ☆10 Jul 6, 2023 Updated 2 years ago
- Demo repository showcasing how to use reusable workflows to build artifact attestations ☆14 Feb 16, 2026 Updated 2 weeks ago
- A Markdown version of the public-domain Menge Bible ☆11 Nov 2, 2021 Updated 4 years ago
- ☆17 Sep 10, 2025 Updated 5 months ago
- A Python client library for accessing IQM quantum computers ☆12 Mar 26, 2025 Updated 11 months ago
- Simple getting started procedure for SciCat ☆11 Updated this week
- gammcor code ☆11 Sep 25, 2025 Updated 5 months ago
- Amplify your coding capabilities with AI - your smart co-pilot for an elevated coding experience. ☆14 Feb 18, 2026 Updated 2 weeks ago
- [CVPR2024] Learning from Synthetic Human Group Activities