rh-aiservices-bu / gpu-partitioning-guide
Repository to demo GPU Sharing with Time Slicing, MPS, MIG and others
☆55Updated last year
Alternatives and similar repositories for gpu-partitioning-guide
Users that are interested in gpu-partitioning-guide are comparing it to the libraries listed below
- InstaSlice Operator facilitates slicing of accelerators using stable APIs☆48Updated this week
- GenAI inference performance benchmarking tool☆141Updated last week
- Holistic job manager on Kubernetes☆115Updated last year
- llm-d helm charts and deployment examples☆48Updated last month
- llm-d benchmark scripts and tooling☆41Updated last week
- Example DRA driver that developers can fork and modify to get them started writing their own.☆112Updated last week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆139Updated last week
- ☆20Updated this week
- open-cluster-management governance material.☆64Updated 3 months ago
- Simplified model deployment on llm-d☆28Updated 6 months ago
- Enabling Kubernetes to make pod placement decisions with platform intelligence.☆176Updated 11 months ago
- Resources, demos, recipes,... to work with LLMs on OpenShift with OpenShift AI or Open Data Hub.☆146Updated 2 weeks ago
- Cloud Native Artificial Intelligence Model Format Specification☆171Updated this week
- ☆34Updated 3 weeks ago
- DOCA Platform manages provisioning and service orchestration for Bluefield DPUs☆71Updated this week
- ☆195Updated this week
- A collection of community maintained NRI plugins☆100Updated last week
- CNI DRA Driver☆38Updated 3 months ago
- 🏃🏿♀️🏃🏽♀️🏃🏻♂️🕒CNCF Technical Advisory Group for Runtime☆95Updated 9 months ago
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing☆30Updated last year
- The kernel module management operator builds, signs and loads kernel modules in Kubernetes clusters.☆115Updated 3 weeks ago
- NVIDIA Network Operator☆318Updated this week
- WG Serving☆34Updated last month
- This project makes running the InstructLab large language model (LLM) fine-tuning process easy and flexible on OpenShift☆27Updated 4 months ago
- NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated compu…☆156Updated this week
- Inference scheduler for llm-d☆120Updated this week
- Test Orchestrator for Performance and Scalability of AI pLatforms☆16Updated this week
- Kubernetes integration for OVN☆91Updated last week
- A tool to detect infrastructure issues on cloud native AI systems☆52Updated 4 months ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.☆74Updated 6 months ago