A collection of YAML files, Helm charts, Operator code, and guides serving as an example reference implementation for NVIDIA NIM deployment.
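For context on what a Helm-based NIM deployment involves, the sketch below shows the general shape of a values file for such a chart: a NIM container image, an NGC pull secret, and a cache volume. Every key, name, and image here is an illustrative assumption, not taken from this listing or from any particular chart's schema.

```yaml
# Hypothetical values.yaml for a NIM Helm deployment.
# All keys, secret names, and the image below are assumptions for illustration.
image:
  repository: nvcr.io/nim/meta/llama3-8b-instruct  # example NIM container image
  tag: "1.0.0"
imagePullSecrets:
  - name: ngc-registry-secret    # pull secret created from an NGC API key
model:
  ngcAPISecret: ngc-api-secret   # Kubernetes secret holding the NGC API key
persistence:
  enabled: true                  # cache downloaded model weights on a PVC
  size: 50Gi
resources:
  limits:
    nvidia.com/gpu: 1            # schedule the pod onto a GPU node
```

In practice such a file would be passed to `helm install -f values.yaml`; consult the chart's own documentation for the real value names.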
☆231 · Apr 30, 2026 · Updated this week
Alternatives and similar repositories for nim-deploy
Users interested in nim-deploy are comparing it to the repositories listed below.
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆224 · May 1, 2025 · Updated last year
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆157 · Updated this week
- ☆60 · Feb 5, 2026 · Updated 2 months ago
- Run cloud native workloads on NVIDIA GPUs ☆233 · Jan 22, 2026 · Updated 3 months ago
- Infrastructure as code for GPU accelerated managed Kubernetes clusters. ☆59 · Apr 30, 2025 · Updated last year
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆3,969 · Mar 30, 2026 · Updated last month
- ☆11 · Mar 16, 2026 · Updated last month
- ☆12 · Dec 20, 2025 · Updated 4 months ago
- Re-scoring a set of docked ligands with off-the-shelf algorithms to assess utility in virtual screening ☆11 · Oct 13, 2021 · Updated 4 years ago
- HyDE based RAG using NVIDIA NIM. ☆16 · Mar 20, 2024 · Updated 2 years ago
- DRA Driver for NVIDIA GPUs ☆633 · Updated this week
- ☆20 · Mar 11, 2026 · Updated last month
- Markdown docs ☆96 · Feb 1, 2026 · Updated 3 months ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆131 · Updated this week
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server. ☆74 · Apr 15, 2026 · Updated 2 weeks ago
- MIG Partition Editor for NVIDIA GPUs ☆248 · Apr 26, 2026 · Updated last week
- Comprehensive, scalable ML inference architecture using Amazon EKS, leveraging Graviton processors for cost-effective CPU-based inference. ☆21 · Mar 12, 2026 · Updated last month
- Kubernetes Operator, Helm Charts, Ansible Playbooks, and utility scripts for large-scale AIStore deployments on Kubernetes. ☆131 · Updated this week
- Gateway API Inference Extension ☆657 · Apr 26, 2026 · Updated last week
- The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents. ☆2,243 · Updated this week
- Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine ☆16 · Apr 28, 2025 · Updated last year
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing ☆12 · Apr 1, 2020 · Updated 6 years ago
- NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes ☆2,661 · Apr 24, 2026 · Updated last week
- Experiments with the text-embeddings-inference server on both CPU and GPU ☆18 · Oct 25, 2023 · Updated 2 years ago
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆712 · Updated this week
- Plugins for Sonobuoy ☆62 · May 20, 2025 · Updated 11 months ago
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference examples. ☆66 · Apr 20, 2026 · Updated last week
- WG Serving ☆35 · Mar 24, 2026 · Updated last month
- ☆15 · Apr 13, 2026 · Updated 2 weeks ago
- This repository contains the results and code for the MLPerf™ Training v3.0 benchmark. ☆12 · Aug 10, 2023 · Updated 2 years ago
- An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG) ☆368 · Aug 12, 2025 · Updated 8 months ago
- ☆14 · May 29, 2024 · Updated last year
- A Datacenter Scale Distributed Inference Serving Framework ☆6,701 · Updated this week
- GPU Environment Management for JupyterLab ☆26 · Feb 19, 2024 · Updated 2 years ago
- ☆25 · Apr 4, 2026 · Updated 3 weeks ago
- This repository contains tutorials and examples for Triton Inference Server ☆830 · Apr 21, 2026 · Updated last week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations. ☆13,487 · Updated this week
- Selenium Grid in ECS using Fargate Spot Containers ☆14 · Feb 1, 2023 · Updated 3 years ago
- ☆67 · Mar 28, 2025 · Updated last year