opea-project / Enterprise-Inference
Intel® AI for Enterprise Inference optimizes AI inference services on Intel hardware using Kubernetes orchestration. It automates LLM deployment, resource provisioning, and performance tuning to deliver faster inference while reducing manual setup work.
☆31 · Updated this week
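Enterprise Inference ships its own automation, but as a rough illustration of the kind of Kubernetes object such tooling manages, here is a minimal sketch using the official `kubernetes` Python client. The serving image, model argument, and accelerator resource key below are assumptions for illustration, not Enterprise-Inference's actual manifests.

```python
# Minimal sketch: programmatically creating a Deployment for an LLM
# inference server. All specifics (image, model, device-plugin resource
# key) are illustrative assumptions, not the project's real configuration.
from kubernetes import client, config

def create_inference_deployment(namespace: str = "default") -> None:
    config.load_kube_config()  # uses the current kubeconfig context

    container = client.V1Container(
        name="llm-server",
        image="vllm/vllm-openai:latest",  # hypothetical serving image
        args=["--model", "mistralai/Mistral-7B-Instruct-v0.2"],  # hypothetical model
        ports=[client.V1ContainerPort(container_port=8000)],
        resources=client.V1ResourceRequirements(
            limits={"habana.ai/gaudi": "1"}  # assumed Gaudi device-plugin key
        ),
    )
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="llm-inference"),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector=client.V1LabelSelector(match_labels={"app": "llm-inference"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "llm-inference"}),
                spec=client.V1PodSpec(containers=[container]),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(
        namespace=namespace, body=deployment
    )

if __name__ == "__main__":
    create_inference_deployment()
```

In practice a tool like this would template such objects per model and hardware profile (e.g., via Helm values) rather than hard-coding them.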
Alternatives and similar repositories for Enterprise-Inference
Users interested in Enterprise-Inference are comparing it to the repositories listed below.
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆140 · Updated 3 weeks ago
- ☆17 · Updated 6 months ago
- A toolkit for discovering cluster network topology. ☆89 · Updated last month
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment… ☆216 · Updated last week
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow… ☆58 · Updated this week
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆28 · Updated last month
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆192 · Updated this week
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆198 · Updated 8 months ago
- Helm charts for llm-d ☆50 · Updated 5 months ago
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆118 · Updated this week
- Containerization and cloud native suite for OPEA ☆73 · Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Updated 3 months ago
- Run cloud native workloads on NVIDIA GPUs ☆213 · Updated this week
- ☆40 · Updated 2 weeks ago
- Model Server for Kepler ☆29 · Updated 3 months ago
- ☆20 · Updated 2 weeks ago
- This repo contains documents of the OPEA project ☆43 · Updated 3 weeks ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 3 months ago
- Evaluation, benchmark, and scorecard, targeting for performance on throughput and latency, accuracy on popular evaluation harness, safety… ☆38 · Updated this week
- WG Serving ☆32 · Updated 3 weeks ago
- Documentation repository for NVIDIA Cloud Native Technologies ☆34 · Updated this week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆150 · Updated this week
- Cloud Native Benchmarking of Foundation Models ☆44 · Updated 5 months ago
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆138 · Updated this week
- Carbon Limiting Auto Tuning for Kubernetes ☆37 · Updated last year
- GenAI inference performance benchmarking tool ☆140 · Updated 2 weeks ago
- ☆275 · Updated this week
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆16 · Updated last week
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Updated 9 months ago
- Examples for building and running LLM services and applications locally with Podman ☆188 · Updated 5 months ago