NVIDIA / nim-deploy
A collection of YAML files, Helm charts, Operator code, and guides that serve as an example reference implementation for NVIDIA NIM deployment.
☆139 · Updated this week
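As a rough illustration of what such a reference deployment might involve, the sketch below shows a minimal Helm values fragment for serving a NIM on Kubernetes. All key names and values here are illustrative assumptions for the sketch, not the actual schema of the nim-deploy charts:

```yaml
# Hypothetical values.yaml sketch for a NIM Helm release.
# Key names are illustrative assumptions, not the real chart schema.
image:
  repository: nvcr.io/nim/example-llm   # placeholder NGC image path
  tag: "1.0.0"
imagePullSecrets:
  - name: ngc-registry-secret           # Kubernetes secret holding NGC credentials
resources:
  limits:
    nvidia.com/gpu: 1                   # schedule the pod onto one GPU
persistence:
  enabled: true                         # cache downloaded model weights across restarts
  size: 50Gi
service:
  type: ClusterIP
  port: 8000                            # NIMs expose an OpenAI-compatible HTTP API
```

In a typical Helm workflow, a file like this would be passed to `helm install` with `-f values.yaml`; consult the nim-deploy repository itself for the charts' real parameters.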
Related projects
Alternatives and complementary repositories for nim-deploy
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆107 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆57 · Updated this week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆85 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆235 · Updated this week
- Run cloud-native workloads on NVIDIA GPUs ☆134 · Updated this week
- End-to-End LLM Guide ☆97 · Updated 4 months ago
- NVIDIA AI Blueprint for multimodal PDF data extraction for enterprise RAG ☆53 · Updated this week
- MIG Partition Editor for NVIDIA GPUs ☆174 · Updated this week
- Self-host LLMs with vLLM and BentoML ☆74 · Updated this week
- Helm charts for the KubeRay project ☆33 · Updated last month
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆57 · Updated last month
- Tutorial for building an LLM router ☆163 · Updated 4 months ago
- This repo contains documents of the OPEA project ☆27 · Updated this week
- Containerization and cloud-native suite for OPEA ☆30 · Updated this week
- Infrastructure as code for GPU-accelerated managed Kubernetes clusters. ☆47 · Updated 6 months ago
- Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray ☆103 · Updated last week
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 · Updated last year
- Evaluate and enhance your LLM deployments for real-world inference needs ☆165 · Updated 2 weeks ago
- Large Language Model Text Generation Inference on Habana Gaudi ☆27 · Updated this week
- An NVIDIA AI Workbench example project for fine-tuning a Mistral 7B model ☆49 · Updated 5 months ago
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆51 · Updated this week
- 🕹️ Performance comparison of MLOps engines, frameworks, and languages on mainstream AI models. ☆134 · Updated 3 months ago
- GenAI components at the microservice level; a GenAI service composer to create mega-services ☆76 · Updated this week
- Provides end-to-end model development pipelines for LLMs and multimodal models that can be launched on-prem or cloud-native. ☆472 · Updated this week
- Easy, fast, and very cheap training and inference on AWS Trainium and Inferentia chips. ☆209 · Updated this week
- Collection of reference workflows for building intelligent agents with NIMs ☆114 · Updated 2 weeks ago
- Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open… ☆277 · Updated this week