Predict the performance of LLM inference services
☆23 · Sep 18, 2025 · Updated 6 months ago
Alternatives and similar repositories for LLM-performance-prediction
Users who are interested in LLM-performance-prediction are comparing it to the libraries listed below.
- Cloud Native Benchmarking of Foundation Models ☆45 · Jul 31, 2025 · Updated 8 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆134 · Feb 22, 2024 · Updated 2 years ago
- Serverless Paper Reading and Discussion ☆38 · Jan 9, 2023 · Updated 3 years ago
- Failure dataset accompanying the paper "How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computi… ☆10 · Jun 12, 2020 · Updated 5 years ago
- ☆16 · May 14, 2025 · Updated 10 months ago
- Simulator for the datacenter, including power, cooling, server and other components ☆17 · Feb 12, 2025 · Updated last year
- LangBench applications and scripts ☆14 · Jun 7, 2023 · Updated 2 years ago
- ☆19 · May 10, 2025 · Updated 11 months ago
- Releasing the spot availability traces used in "Can't Be Late" paper. ☆26 · Mar 31, 2024 · Updated 2 years ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24] ☆24 · Nov 21, 2024 · Updated last year
- A large-scale simulation framework for LLM inference ☆581 · Jul 25, 2025 · Updated 8 months ago
- Dolphin - a Deep Learning on MIC architecture Project. ☆25 · Oct 30, 2014 · Updated 11 years ago
- ☆178 · Mar 12, 2024 · Updated 2 years ago
- A series of work towards achieving ACV. ☆35 · Apr 1, 2026 · Updated last week
- TraceWeaver is a research prototype for transparently tracing requests through a microservice without application instrumentation. ☆23 · Sep 2, 2024 · Updated last year
- LLM Inference analyzer for different hardware platforms ☆109 · Apr 4, 2026 · Updated last week
- ☆14 · Jun 20, 2025 · Updated 9 months ago
- ☆20 · Sep 25, 2023 · Updated 2 years ago
- ☆18 · Oct 31, 2022 · Updated 3 years ago
- ☆10 · Dec 10, 2024 · Updated last year
- Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving (HPCA '23) ☆14 · Jun 20, 2025 · Updated 9 months ago
- ☆30 · Mar 20, 2022 · Updated 4 years ago
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ☆35 · Mar 31, 2026 · Updated last week
- LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure ☆235 · Mar 13, 2026 · Updated 3 weeks ago
- A tool to detect infrastructure issues on cloud native AI systems ☆53 · Sep 18, 2025 · Updated 6 months ago
- Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios. ☆106 · Aug 14, 2024 · Updated last year
- Pytorch implementation for the pilot study on the robustness of latent diffusion models. ☆12 · Jun 20, 2023 · Updated 2 years ago
- LLM Serving Performance Evaluation Harness ☆84 · Feb 25, 2025 · Updated last year
- From Task-based to Instruction-based Automated Log Analysis ☆22 · Jan 7, 2025 · Updated last year
- Collect information about 2018 CS courses in CSE of SYSU. ☆11 · Jun 29, 2022 · Updated 3 years ago
- Official Tensorflow implementation for "Improving the Transferability of Adversarial Samples by Path-Augmented Method" (CVPR 2023). ☆12 · Jun 16, 2023 · Updated 2 years ago
- Code for SIGKDD2025 paper: An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem ☆14 · Jan 28, 2025 · Updated last year
- An Observability Framework for AI Training ☆68 · Mar 25, 2026 · Updated 2 weeks ago
- This repository manifests set which is made to build a prototype system of TraceZip, made by 4 pieces. ☆14 · Jul 17, 2025 · Updated 8 months ago
- ☆80 · Apr 4, 2026 · Updated last week
- ☆17 · May 29, 2025 · Updated 10 months ago
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel ☆20 · Nov 8, 2025 · Updated 5 months ago
- ☆10 · Jun 4, 2024 · Updated last year
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆58 · Jul 23, 2024 · Updated last year