EachSheep / ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
☆99, updated last month
Alternatives and similar repositories for ShortcutsBench
Users who are interested in ShortcutsBench are comparing it to the repositories listed below.
- Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction ☆119, updated 2 weeks ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆186, updated 3 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆28, updated 3 months ago
- Reproducing R1 for Code with Reliable Rewards ☆208, updated last month
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆49, updated 3 months ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co… ☆147, updated 3 months ago
- PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline". ☆86, updated 2 years ago
- Survey Paper List - Efficient LLM and Foundation Models ☆248, updated 8 months ago
- Multi-Candidate Speculative Decoding ☆35, updated last year
- Simple extension on vLLM to help you speed up reasoning models without training. ☆158, updated last week
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ☆17, updated last month
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆158, updated 8 months ago
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling ☆29, updated last week
- A Stream-based LLM Agent Framework for Continuous Context Sensing and Sharing ☆38, updated 6 months ago
- This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act… ☆16, updated 7 months ago
- Zotero plugin for quick access to paper CCF rating, conference/journal, and citation count. ☆43, updated last month
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆48, updated 7 months ago
- A version of verl to support tool use ☆172, updated this week
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆51, updated 10 months ago
- LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation ☆60, updated 9 months ago
- Neural Code Intelligence Survey 2024; Reading lists and resources ☆260, updated 2 months ago
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding ☆253, updated 9 months ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆76, updated 4 months ago
- The repo for In-context Autoencoder ☆127, updated last year