opea-project / Enterprise-Inference
Intel® AI for Enterprise Inference optimizes AI inference services on Intel hardware using Kubernetes orchestration. It automates LLM deployment, resource provisioning, and configuration of optimal runtime settings, reducing manual work and speeding up inference.
☆32 · Updated this week
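For context, Kubernetes-based LLM serving of this kind typically comes down to creating a Deployment that pins a model-server container to accelerator resources. Below is a minimal, hypothetical sketch using the official `kubernetes` Python client; the image, model name, and the Intel Gaudi device-plugin resource key are illustrative assumptions, not Enterprise-Inference's actual manifests or APIs.

```python
# Hypothetical sketch: create an LLM inference Deployment with the official
# `kubernetes` Python client. Image, model, and resource key are assumptions,
# not taken from the Enterprise-Inference project itself.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

container = client.V1Container(
    name="llm-server",
    image="vllm/vllm-openai:latest",  # assumption: any OpenAI-compatible server image
    args=["--model", "meta-llama/Llama-3.1-8B-Instruct"],  # assumption: example model
    resources=client.V1ResourceRequirements(
        limits={"habana.ai/gaudi": "1"}  # assumption: Intel Gaudi device-plugin resource
    ),
    ports=[client.V1ContainerPort(container_port=8000)],
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="llm-inference"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "llm-inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "llm-inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

# Submit the Deployment; the cluster's device plugin schedules the pod
# onto a node that can satisfy the accelerator resource request.
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```

In practice, projects in this space wrap the same pattern in Helm charts or an Operator so that model choice, replica count, and accelerator type become values rather than hand-written manifests.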
Alternatives and similar repositories for Enterprise-Inference
Users interested in Enterprise-Inference are comparing it to the repositories listed below
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆142 · Updated last week
- ☆17 · Updated 7 months ago
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆221 · Updated this week
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆193 · Updated last week
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆200 · Updated 8 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 4 months ago
- ☆40 · Updated last week
- ☆40 · Updated last week
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Updated 4 months ago
- Containerization and cloud native suite for OPEA ☆74 · Updated 3 weeks ago
- A toolkit for discovering cluster network topology. ☆93 · Updated this week
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow … ☆59 · Updated last week
- Evaluation, benchmark, and scorecard, targeting for performance on throughput and latency, accuracy on popular evaluation harness, safety… ☆38 · Updated 3 weeks ago
- Model Server for Kepler ☆29 · Updated 3 months ago
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆28 · Updated 2 months ago
- Intent Driven Orchestration enables management of applications through their Service Level Objectives, while minimizing developer and adm… ☆48 · Updated 2 months ago
- Documentation repository for NVIDIA Cloud Native Technologies ☆35 · Updated last week
- Cloud Native Benchmarking of Foundation Models ☆44 · Updated 6 months ago
- This repo contains documents of the OPEA project ☆43 · Updated last month
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆124 · Updated this week
- Run cloud native workloads on NVIDIA GPUs ☆220 · Updated last week
- Carbon Limiting Auto Tuning for Kubernetes ☆37 · Updated last year
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆155 · Updated last week
- Helm charts for llm-d ☆52 · Updated 6 months ago
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Updated 10 months ago
- WG Serving ☆34 · Updated last month
- ☆123 · Updated 2 months ago
- Route LLM requests to the best model for the task at hand. ☆171 · Updated 2 weeks ago
- For individual users, watsonx Code Assistant can access a local IBM Granite model ☆37 · Updated 7 months ago
- ☆60 · Updated this week