oriyor / assistantbench
Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"
☆59 Updated 7 months ago
Alternatives and similar repositories for assistantbench
Users who are interested in assistantbench are comparing it to the repositories listed below.
- ☆41 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆88 Updated 10 months ago
- ☆117 Updated 5 months ago
- Verifiers for LLM Reinforcement Learning ☆68 Updated 3 months ago