EachSheep / ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
☆108 Updated 7 months ago
Alternatives and similar repositories for ShortcutsBench
Users who are interested in ShortcutsBench are comparing it to the repositories listed below.
- Survey Paper List - Efficient LLM and Foundation Models ☆260 Updated last year
- Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction ☆161 Updated 2 months ago
- Reproducing R1 for Code with Reliable Rewards ☆282 Updated 8 months ago
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training. ☆218 Updated 8 months ago
- A Stream-based LLM Agent Framework for Continuous Context Sensing and Sharing ☆41 Updated 3 months ago
- A Comprehensive Benchmark for Software Development. ☆127 Updated last year
- ☆41 Updated 10 months ago
- Paper list for Personal LLM Agents ☆424 Updated last year
- A Comprehensive Survey on Long Context Language Modeling ☆219 Updated 2 months ago
- ☆144 Updated 4 months ago
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet ☆238 Updated 2 months ago
- ☆100 Updated 10 months ago
- ☆67 Updated last year
- ☆50 Updated 5 months ago