aws-samples / scalable-hw-agnostic-inference

A hardware-agnostic deployment (NVIDIA GPUs and AWS Inferentia accelerators) of computer-vision models (e.g., YOLO, ViT), text-generation models (e.g., Llama3), and text-to-image models (e.g., Stable Diffusion) on EKS. K8s ingress controls routing, Karpenter controls scheduling, and KEDA handles scaling.
20 · Updated last week

Alternatives and similar repositories for scalable-hw-agnostic-inference:

Users who are interested in scalable-hw-agnostic-inference are comparing it to the libraries listed below.