EachSheep / ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
☆107 · Updated 5 months ago
Alternatives and similar repositories for ShortcutsBench
Users who are interested in ShortcutsBench are comparing it to the repositories listed below.
- Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction ☆153 · Updated last week
- Reproducing R1 for Code with Reliable Rewards ☆272 · Updated 6 months ago
- Paper list for Personal LLM Agents ☆421 · Updated last year
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training. ☆207 · Updated 5 months ago
- A Comprehensive Benchmark for Software Development. ☆119 · Updated last year
- ☆91 · Updated 8 months ago
- Survey Paper List - Efficient LLM and Foundation Models ☆257 · Updated last year
- ☆136 · Updated 2 months ago
- A Stream-based LLM Agent Framework for Continuous Context Sensing and Sharing ☆41 · Updated last month
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet ☆214 · Updated 2 weeks ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆58 · Updated 9 months ago
- ☆168 · Updated last month
- ☆65 · Updated last year
- ☆29 · Updated last month
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆196 · Updated 3 weeks ago
- A Comprehensive Survey on Long Context Language Modeling ☆204 · Updated this week
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97 · Updated 9 months ago
- A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble" ☆164 · Updated last week
- Official repository for our paper "FullStack Bench: Evaluating LLMs as Full Stack Coders" ☆107 · Updated 6 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆348 · Updated last week
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆324 · Updated 2 months ago
- Neural Code Intelligence Survey 2024; Reading lists and resources ☆276 · Updated 4 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆36 · Updated 9 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆135 · Updated 9 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆70 · Updated 7 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆194 · Updated 2 weeks ago
- [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness. ☆51 · Updated 6 months ago
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey ☆35 · Updated 6 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆97 · Updated 2 months ago
- awesome llm plaza: daily tracking all sorts of awesome topics of llm, e.g. llm for coding, robotics, reasoning, multimod etc. ☆211 · Updated 3 weeks ago