RAIVNLab / mnms

m&ms: A Benchmark to Evaluate Tool-Use for Multi-step Multi-modal Tasks
30 · Updated 5 months ago

Related projects: