opea-project / Enterprise-Inference
Intel® AI for Enterprise Inference optimizes AI inference services on Intel hardware using Kubernetes orchestration. It automates LLM deployment, resource provisioning, and configuration tuning, reducing manual work and speeding up inference.
☆30 · Updated this week
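The end product of this kind of automation is an ordinary Kubernetes Deployment serving an LLM endpoint. As a purely illustrative sketch (not Enterprise-Inference's actual tooling), the snippet below uses the official `kubernetes` Python client to create such a Deployment; the container image, model name, and the `habana.ai/gaudi` resource request for an Intel Gaudi node are assumptions, not taken from the project.

```python
# Illustrative only: hand-rolls one LLM-serving Deployment with the official
# kubernetes Python client. Image, model, and resource names are assumptions.
from kubernetes import client, config

config.load_kube_config()  # use the current kubeconfig context
apps = client.AppsV1Api()

container = client.V1Container(
    name="llm-server",
    image="vllm/vllm-openai:latest",                      # assumed serving image
    args=["--model", "meta-llama/Llama-3.1-8B-Instruct"],  # assumed model
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"habana.ai/gaudi": "1"},  # one Intel Gaudi accelerator (assumed)
    ),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="llm-server"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "llm-server"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "llm-server"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)
```

A deployment automation layer's value is generating and maintaining many such manifests (plus Services, autoscaling, and hardware-specific settings) so operators don't write them by hand.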
Alternatives and similar repositories for Enterprise-Inference
Users interested in Enterprise-Inference are comparing it to the repositories listed below.
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆141 · Updated this week
- ☆17 · Updated 6 months ago
- Model Server for Kepler ☆29 · Updated 2 months ago
- ☆39 · Updated this week
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆216 · Updated last week
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆25 · Updated 3 weeks ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 2 months ago
- Containerization and cloud native suite for OPEA ☆72 · Updated 2 months ago
- Helm charts for llm-d ☆50 · Updated 4 months ago
- Carbon Limiting Auto Tuning for Kubernetes ☆37 · Updated last year
- A toolkit for discovering cluster network topology. ☆86 · Updated last week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆148 · Updated this week
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow … ☆56 · Updated this week
- WG Serving ☆32 · Updated last week
- ☆40 · Updated 2 weeks ago
- Cloud Native Benchmarking of Foundation Models ☆44 · Updated 4 months ago
- GenAI Studio is a low code platform to enable users to construct, evaluate, and benchmark GenAI applications. The platform also provide c… ☆55 · Updated 2 weeks ago
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆193 · Updated 7 months ago
- Run cloud native workloads on NVIDIA GPUs ☆210 · Updated 2 months ago
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆190 · Updated this week
- For individual users, watsonx Code Assistant can access a local IBM Granite model ☆37 · Updated 5 months ago
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆30 · Updated last year
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆16 · Updated this week
- This repo contains documents of the OPEA project ☆43 · Updated 3 months ago
- 🎉 An awesome & curated list of best LLMOps tools. ☆175 · Updated 2 weeks ago
- ☆20 · Updated this week
- GenAI inference performance benchmarking tool ☆137 · Updated 2 weeks ago
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Updated 3 months ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆119 · Updated this week
- Health checks for Azure N- and H-series VMs. ☆55 · Updated 2 weeks ago