lechmazur / generalizationView on GitHub
Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
65Apr 8, 2026Updated this week

Alternatives and similar repositories for generalization

Users that are interested in generalization are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Are these results useful?