lechmazur / generalization
Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
☆63 Updated 3 months ago
Alternatives and similar repositories for generalization
Users who are interested in generalization are comparing it to the libraries listed below.
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆56 Updated 11 months ago
- ☆24 Updated 11 months ago
- fork of litellm that is open source ☆21 Updated last year
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆82 Updated last month
- ☆119 Updated last year
- A Python library to orchestrate LLMs in a neural network-inspired structure