IBM/LLM-performance-prediction

Predict the performance of LLM inference services

☆23

Alternatives and similar repositories for LLM-performance-prediction

Users who are interested in LLM-performance-prediction are comparing it to the repositories listed below.

Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆135 · Feb 22, 2024 · Updated 2 years ago
Jeffwan / serverless-research
Serverless Paper Reading and Discussion
☆38 · Jan 9, 2023 · Updated 3 years ago
dessertlab / Fault-Injection-Dataset
Failure dataset accompanying the paper "How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computi…
☆10 · Jun 12, 2020 · Updated 6 years ago
dsrg-uoft / LangBench
LangBench applications and scripts
☆14 · Jun 7, 2023 · Updated 3 years ago
JhengLu / OpenInfra
Simulator for the datacenter, including power, cooling, server and other components
☆19 · Feb 12, 2025 · Updated last year
skypilot-org / spot-traces
Releasing the spot availability traces used in "Can't Be Late" paper.
☆26 · Mar 31, 2024 · Updated 2 years ago
inpluslab-wuhui / Systems-for-Foundation-Models
☆20 · May 10, 2025 · Updated last year
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24 · Nov 21, 2024 · Updated last year
EMDC-OS / CorePartitioning
☆13 · Jun 20, 2025 · Updated last year
aFuerst / faascache-sim
☆18 · Oct 31, 2022 · Updated 3 years ago
InternLM / AcmeTrace
☆179 · Mar 12, 2024 · Updated 2 years ago
chengjiagan / RunD_ATC22
☆20 · Sep 25, 2023 · Updated 2 years ago
abhibambhaniya / GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
☆119 · Jun 23, 2026 · Updated 3 weeks ago
microsoft / vidur
Accurate, large-scale, and extensible simulator for LLM inference Systems
☆642 · Jul 25, 2025 · Updated 11 months ago
ucsdsysnet / faasnap
☆30 · Mar 20, 2022 · Updated 4 years ago
nl2logql / LogQLLM
☆10 · Dec 10, 2024 · Updated last year
BaizeAI / kcover
🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.
☆35 · Updated this week
patronus-ai / trail-benchmark
☆21 · May 14, 2025 · Updated last year
microsoft / batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
☆106 · Aug 14, 2024 · Updated last year
IBM / autopilot
A tool to detect infrastructure issues on cloud native AI systems
☆53 · Sep 18, 2025 · Updated 10 months ago
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆84 · Feb 25, 2025 · Updated last year
jpzhang1810 / PAM
Official Tensorflow implementation for "Improving the Transferability of Adversarial Samples by Path-Augmented Method" (CVPR 2023).
☆12 · Jun 16, 2023 · Updated 3 years ago
jpzhang1810 / LDM-Robustness
Pytorch implementation for the pilot study on the robustness of latent diffusion models.
☆13 · Jun 20, 2023 · Updated 3 years ago
OpsPAI / TraceZip
This repository manifests set which is made to build a prototype system of TraceZip, made by 4 pieces.
☆14 · Jul 17, 2025 · Updated last year
fish98 / CAShift
CAShift: Benchmarking Log-Based Cloud Attack Detection under Normality Shift (FSE 2025)
☆15 · Jun 25, 2026 · Updated 3 weeks ago
alibaba / eCommerceSearchBench
E-commerce search benchmark is the first end-to-end application benchmark for e-commerce search system with personalized recommendations.…
☆44 · Feb 15, 2023 · Updated 3 years ago
HPMLL / BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆279 · Jun 30, 2026 · Updated 3 weeks ago
utnslab / Medes
Deduplication over dis-aggregated memory for Serverless Computing
☆14 · Mar 21, 2022 · Updated 4 years ago
AaltoPML / human-in-the-loop-predictive-maintenance
☆10 · Jun 4, 2024 · Updated 2 years ago
keanudicap / MSQA
Microsoft question-answering dataset
☆10 · Jun 16, 2023 · Updated 3 years ago
PipeFusion / PipeFusion
A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters
☆58 · May 3, 2026 · Updated 2 months ago
mu5358271 / spark-on-fargate
Serverless Apache Spark On AWS Fargate
☆17 · Jun 1, 2019 · Updated 7 years ago
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆222 · Sep 21, 2024 · Updated last year
eth-easl / pccheck
☆12 · Apr 23, 2026 · Updated 2 months ago
alibaba / hap
☆16 · Apr 13, 2024 · Updated 2 years ago
nabenabe0928 / meta-learn-tpe
[IJCAI'23] Speeding Up Multi-Objective Hyperparameter Optimization by Task Similarity-Based Meta-Learning for the Tree-Structured Parzen …
☆10 · Apr 24, 2026 · Updated 2 months ago
charliewwdev / alpine-desktop
☆13 · Jul 7, 2017 · Updated 9 years ago
Montimage / maip
A platform that provides users with easy access to AI services developed by Montimage and usage of explainable AI techniques (e.g., LIME,…
☆10 · Feb 17, 2026 · Updated 5 months ago
llylly / RANUM
[ICSE 2023] Differentiable interpretation and failure-inducing input generation for neural network numerical bugs.
☆13 · Jan 5, 2024 · Updated 2 years ago