EachSheep / ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
☆71Updated this week
Related projects:
- ☆93Updated 8 months ago
- Survey Paper List - Efficient LLM and Foundation Models☆190Updated 6 months ago
- Course Material for the UG Course COMP4901Y☆46Updated 4 months ago
- PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".☆70Updated last year
- ☆29Updated last month
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable☆89Updated last week
- ☆88Updated 3 years ago
- A collection of PhD application guides with Chinese-language introductions☆41Updated last year
- Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of pap…☆153Updated this week
- [UIST 2024] LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation☆42Updated last month
- Papers and their code for AI systems☆202Updated 3 weeks ago
- ☆37Updated 5 months ago
- A curated list of high-quality papers on resource-efficient LLMs 🌱☆67Updated 2 months ago
- ☆16Updated 2 months ago
- ☆184Updated 8 months ago
- ATC23 AE☆42Updated last year
- Examples of experiment figures that can be used in papers☆186Updated 7 months ago
- Simply call ChatGPT APIs and store chat history in CSV☆22Updated last year
- ☆52Updated 7 months ago
- A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of …☆84Updated 2 months ago
- Surrogate-based Hyperparameter Tuning System☆26Updated last year
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**☆127Updated 3 months ago
- Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs☆24Updated last week
- MobiSys#114☆21Updated last year
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆350Updated last week
- CMU 11868 Large Language Model Systems Spring 2024☆12Updated 4 months ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters.☆14Updated 4 months ago
- ☆25Updated last month
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization"☆25Updated 6 months ago
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method☆60Updated 3 weeks ago