fmperf-project / fmperf

Cloud Native Benchmarking of Foundation Models

☆39

Alternatives and similar repositories for fmperf

Users who are interested in fmperf are comparing it to the libraries listed below.

IBM / autopilot
A tool to detect infrastructure issues on cloud native AI systems
☆44 Updated last week
sgl-project / ome
OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
☆202 Updated this week
LMCache / LMBenchmark
Systematic and comprehensive benchmarks for LLM systems.
☆24 Updated last month
llm-d / llm-d-kv-cache-manager
Distributed KV cache coordinator
☆46 Updated this week
coreweave / nccl-tests
NVIDIA NCCL Tests for Distributed Training
☆102 Updated 2 weeks ago
llm-d / llm-d-inference-sim
A lightweight vLLM simulator, for mocking out replicas.
☆31 Updated this week
NVIDIA / cuda-checkpoint
CUDA checkpoint and restore utility
☆357 Updated 6 months ago
NVIDIA / topograph
A toolkit for discovering cluster network topology.
☆61 Updated this week
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆71 Updated this week
NVIDIA / k8s-nim-operator
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
☆120 Updated this week
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆125 Updated last year
llm-d / llm-d-benchmark
llm-d benchmark scripts and tooling
☆21 Updated this week
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆117 Updated last year
llm-d / llm-d-inference-scheduler
Inference scheduler for llm-d
☆72 Updated this week
Azure / azurehpc-health-checks
Health checks for Azure N- and H-series VMs.
☆48 Updated this week
run-ai / llmperf
☆58 Updated 10 months ago
LMCache / lmcache-vllm
The driver for LMCache core to run in vLLM
☆45 Updated 6 months ago
GPUprobe / gpuprobe-daemon
Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes
☆121 Updated 4 months ago
NVIDIA / nvidia-resiliency-ext
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …
☆196 Updated last week
NVIDIA / knavigator
knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.
☆69 Updated 2 weeks ago
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆502 Updated this week
IBM / LLM-performance-prediction
Predict the performance of LLM inference services
☆19 Updated 3 months ago
BaizeAI / kcover
🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.
☆31 Updated this week
heyfey / vodascheduler
GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)
☆35 Updated last year
Azure / msccl
Microsoft Collective Communication Library
☆65 Updated 8 months ago
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆79 Updated 5 months ago
NTHU-LSALAB / Gemini
An efficient GPU resource sharing system with fine-grained control for Linux platforms.
☆84 Updated last year
WukLab / preble
Stateful LLM Serving
☆79 Updated 4 months ago
llm-d / llm-d-deployer
Helm charts for llm-d
☆51 Updated 2 weeks ago
AlibabaPAI / llumnix
Efficient and easy multi-instance LLM serving
☆458 Updated this week