lechmazur / generalization

Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
57 · Updated this week

Alternatives and similar repositories for generalization

Users who are interested in generalization compare it to the libraries listed below.
