snap-stanford / MLAgentBench
☆297 Updated last year
Alternatives and similar repositories for MLAgentBench
Users who are interested in MLAgentBench are comparing it to the libraries listed below.
- A benchmark list for evaluation of large language models. ☆130 Updated 2 weeks ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆329 Updated last year