openshift-psap / llm-load-test
☆51 · Updated 6 months ago
Alternatives and similar repositories for llm-load-test
Users interested in llm-load-test are comparing it to the repositories listed below
- ☆20 · Updated this week
- llm-d benchmark scripts and tooling ☆44 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆146 · Updated last week
- Helm charts for llm-d ☆52 · Updated 6 months ago
- Model Registry provides a single pane of glass for ML model developers to index and manage models, versions, and ML artifacts metadata. I… ☆161 · Updated this week
- GenAI inference performance benchmarking tool ☆142 · Updated last week
- Resources, demos, recipes, ... to work with LLMs on OpenShift with OpenShift AI or Open Data Hub. ☆146 · Updated last month
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆843 · Updated this week
- This project makes running the InstructLab large language model (LLM) fine-tuning process easy and flexible on OpenShift ☆27 · Updated 5 months ago
- llm-d helm charts and deployment examples ☆48 · Updated last month
- NVIDIA DRA Driver for GPUs ☆557 · Updated this week
- Repository to deploy LLMs with Multi-GPUs in distributed Kubernetes nodes ☆29 · Updated last year
- Auto-tuning for vllm. Getting the best performance out of your LLM deployment (vllm+guidellm+optuna) ☆32 · Updated last week
- Controller for ModelMesh ☆242 · Updated 8 months ago
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ☆50 · Updated this week
- Distributed Model Serving Framework ☆185 · Updated 4 months ago
- Repository to demo GPU Sharing with Time Slicing, MPS, MIG and others ☆56 · Updated last year
- Models as a Service ☆73 · Updated 3 months ago
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆16 · Updated 2 weeks ago
- Collection of demos for building Llama Stack based apps on OpenShift ☆59 · Updated last week
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆159 · Updated this week
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆365 · Updated this week
- Artifacts for the Distributed Workloads stack as part of ODH ☆33 · Updated this week
- Simplified model deployment on llm-d ☆28 · Updated 7 months ago
- Gateway API Inference Extension ☆576 · Updated this week
- ☆17 · Updated this week
- AI-on-OpenShift website source code ☆101 · Updated 2 months ago
- Containerization and cloud native suite for OPEA ☆74 · Updated last month
- Cloud Native Benchmarking of Foundation Models ☆45 · Updated 6 months ago
- MIG Partition Editor for NVIDIA GPUs ☆240 · Updated this week