opea-project / Enterprise-Inference
Intel® AI for Enterprise Inference optimizes AI inference services on Intel hardware using Kubernetes orchestration. It automates LLM deployment, resource provisioning, and configuration tuning to speed up inference and reduce manual work.
☆32 · Updated this week
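As context for the kind of step this project automates, here is a minimal sketch of creating a Kubernetes Deployment that serves an LLM, using the official kubernetes Python client. The image, model identifier, labels, and resource figures are illustrative assumptions, not Enterprise-Inference defaults:

```python
# Hypothetical sketch: programmatically create a Kubernetes Deployment that
# serves an LLM -- the kind of step Enterprise-Inference automates.
# Assumes a working kubeconfig; image, model, and resources are placeholders.
from kubernetes import client, config

config.load_kube_config()  # read cluster credentials from ~/.kube/config
apps = client.AppsV1Api()

container = client.V1Container(
    name="llm-server",
    image="vllm/vllm-openai:latest",              # placeholder serving image
    args=["--model", "example-org/example-llm"],  # placeholder model id
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        # CPU-only figures for illustration; an accelerator (e.g. Gaudi)
        # would instead be requested via its device-plugin resource name.
        requests={"cpu": "8", "memory": "32Gi"},
        limits={"cpu": "16", "memory": "64Gi"},
    ),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="llm-inference"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "llm-inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "llm-inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)
```

In practice a tool like this layers model selection, hardware-aware resource requests, and tuned serving settings on top of this bare Deployment, which is what removes the manual work.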
Alternatives and similar repositories for Enterprise-Inference
Users interested in Enterprise-Inference are comparing it to the libraries listed below.
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆193 · Updated 2 weeks ago
- Helm charts for llm-d ☆52 · Updated 6 months ago
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆221 · Updated this week
- Containerization and cloud native suite for OPEA ☆74 · Updated last month
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆146 · Updated this week
- This repo contains documents of the OPEA project ☆43 · Updated last month
- Model Server for Kepler ☆29 · Updated 3 months ago
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow … ☆60 · Updated 2 weeks ago
- ☆17 · Updated 7 months ago
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Updated 4 months ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆124 · Updated this week
- Documentation repository for NVIDIA Cloud Native Technologies ☆35 · Updated this week
- A toolkit for discovering cluster network topology. ☆96 · Updated last week
- Cloud Native Benchmarking of Foundation Models ☆45 · Updated 6 months ago
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆155 · Updated last week
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆28 · Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 4 months ago
- Evaluation, benchmark, and scorecard, targeting for performance on throughput and latency, accuracy on popular evaluation harness, safety… ☆38 · Updated last month
- Carbon Limiting Auto Tuning for Kubernetes ☆37 · Updated last year
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆202 · Updated 9 months ago
- ☆43 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆16 · Updated last week
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Updated 10 months ago
- ☆280 · Updated this week
- WG Serving ☆34 · Updated last month
- ☆76 · Updated this week
- GenAI Studio is a low code platform to enable users to construct, evaluate, and benchmark GenAI applications. The platform also provide c… ☆59 · Updated 3 weeks ago
- 🎉 An awesome & curated list of best LLMOps tools. ☆190 · Updated this week
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆159 · Updated this week
- ☆44 · Updated this week