Predict the performance of LLM inference services
☆23 · Sep 18, 2025 · Updated 6 months ago
Alternatives and similar repositories for LLM-performance-prediction
Users who are interested in LLM-performance-prediction are comparing it to the repositories listed below.
- Cloud Native Benchmarking of Foundation Models ☆45 · Jul 31, 2025 · Updated 7 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆134 · Feb 22, 2024 · Updated 2 years ago
- Serverless Paper Reading and Discussion ☆38 · Jan 9, 2023 · Updated 3 years ago
- (no description) ☆16 · May 14, 2025 · Updated 10 months ago
- Simulator for the datacenter, including power, cooling, server and other components ☆17 · Feb 12, 2025 · Updated last year
- LangBench applications and scripts ☆14 · Jun 7, 2023 · Updated 2 years ago
- (no description) ☆19 · May 10, 2025 · Updated 10 months ago
- Releasing the spot availability traces used in "Can't Be Late" paper. ☆25 · Mar 31, 2024 · Updated last year
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24] ☆24 · Nov 21, 2024 · Updated last year
- A large-scale simulation framework for LLM inference ☆556 · Jul 25, 2025 · Updated 7 months ago
- Dolphin - a Deep Learning on MIC architecture Project. ☆25 · Oct 30, 2014 · Updated 11 years ago
- (no description) ☆176 · Mar 12, 2024 · Updated 2 years ago
- A series of work towards achieving ACV. ☆32 · Jan 28, 2026 · Updated last month
- LLM Inference analyzer for different hardware platforms ☆108 · Feb 17, 2026 · Updated last month
- TraceWeaver is a research prototype for transparently tracing requests through a microservice without application instrumentation. ☆23 · Sep 2, 2024 · Updated last year
- (no description) ☆14 · Jun 20, 2025 · Updated 9 months ago
- (no description) ☆20 · Sep 25, 2023 · Updated 2 years ago
- (no description) ☆18 · Oct 31, 2022 · Updated 3 years ago
- Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving (HPCA '23) ☆14 · Jun 20, 2025 · Updated 9 months ago
- (no description) ☆10 · Dec 10, 2024 · Updated last year
- (no description) ☆30 · Mar 20, 2022 · Updated 4 years ago
- LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure ☆208 · Mar 13, 2026 · Updated last week
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Sep 18, 2025 · Updated 6 months ago
- Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios. ☆106 · Aug 14, 2024 · Updated last year
- From Task-based to Instruction-based Automated Log Analysis ☆23 · Jan 7, 2025 · Updated last year
- Pytorch implementation for the pilot study on the robustness of latent diffusion models. ☆12 · Jun 20, 2023 · Updated 2 years ago
- LLM Serving Performance Evaluation Harness ☆83 · Feb 25, 2025 · Updated last year
- CAShift: Benchmarking Log-Based Cloud Attack Detection under Normality Shift (FSE 2025) ☆13 · May 19, 2025 · Updated 10 months ago
- Collect information about 2018 CS courses in CSE of SYSU. ☆11 · Jun 29, 2022 · Updated 3 years ago
- An Observability Framework for AI Training ☆66 · Updated this week
- E-commerce search benchmark is the first end-to-end application benchmark for e-commerce search system with personalized recommendations.… ☆45 · Feb 15, 2023 · Updated 3 years ago
- Official Tensorflow implementation for "Improving the Transferability of Adversarial Samples by Path-Augmented Method" (CVPR 2023). ☆12 · Jun 16, 2023 · Updated 2 years ago
- Code for SIGKDD2025 paper: An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem ☆14 · Jan 28, 2025 · Updated last year
- This repository manifests set which is made to build a prototype system of TraceZip, made by 4 pieces. ☆14 · Jul 17, 2025 · Updated 8 months ago
- (no description) ☆78 · Mar 14, 2026 · Updated last week
- (no description) ☆17 · May 29, 2025 · Updated 9 months ago
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel ☆17 · Nov 8, 2025 · Updated 4 months ago
- (no description) ☆10 · Jun 4, 2024 · Updated last year
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆57 · Jul 23, 2024 · Updated last year