tensorchord / ai-infra-landscape
This is a landscape of the infrastructure that powers the generative AI ecosystem
☆142 Updated 6 months ago
Alternatives and similar repositories for ai-infra-landscape:
Users who are interested in ai-infra-landscape are comparing it to the libraries listed below.
- An awesome & curated list of best LLMOps tools. ☆87 Updated last week
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others) ☆263 Updated last year
- Repository for open inference protocol specification ☆53 Updated 9 months ago
- Finetune LLMs on K8s by using Runbooks ☆170 Updated 7 months ago
- Helm charts for the KubeRay project ☆43 Updated 2 weeks ago
- K8s device plugin for GPU sharing ☆100 Updated last year
- Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs. ☆67 Updated last week
- Cloud-native way to provide elastic Jupyter Notebooks on Kubernetes. Run remote kernels, natively. ☆196 Updated 3 years ago
- Deploy AI models and apps to Kubernetes without developing a hernia ☆32 Updated 11 months ago
- ChatData brings RAG to real applications with FREE✨ knowledge bases. Now enjoy your chat with 6 million Wikipedia pages and 2 milli… ☆171 Updated 5 months ago
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ☆29 Updated 4 months ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆65 Updated last week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. Star to support our work! ☆127 Updated this week
- Holistic job manager on Kubernetes ☆115 Updated last year
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆411 Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆92 Updated this week
- MaK(Mac+Kubernetes)llama - Running and orchestrating large language models (LLMs) on Kubernetes with macOS nodes. ☆39 Updated 11 months ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆219 Updated last week
- Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI se… ☆227 Updated this week
- GenAI inference performance benchmarking tool ☆39 Updated 3 weeks ago
- ☆28 Updated 11 months ago
- A distributed engine for intelligent workload ☆27 Updated 2 months ago
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 Updated last year
- MLFlow Deployment Plugin for Ray Serve ☆44 Updated 3 years ago
- ☆108 Updated 11 months ago
- Self-host LLMs with vLLM and BentoML ☆106 Updated last week
- This repository contains statistics about the AI Infrastructure products. ☆18 Updated last month
- Open-source observability for your LLM application. ☆51 Updated 3 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆65 Updated last year
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling. ☆140 Updated 2 years ago