tensorchord / ai-infra-landscape
This is a landscape of the infrastructure that powers the generative AI ecosystem
★137 · Updated 4 months ago
Alternatives and similar repositories for ai-infra-landscape:
Users who are interested in ai-infra-landscape are comparing it to the repositories listed below.
- An awesome & curated list of best LLMOps tools. ★41 · Updated 2 weeks ago
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others) ★253 · Updated last year
- Finetune LLMs on K8s by using Runbooks ★170 · Updated 6 months ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. Star to support our work! ★79 · Updated this week
- Cloud-native way to provide elastic Jupyter Notebooks on Kubernetes. Run remote kernels, natively. ★195 · Updated 2 years ago
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ★26 · Updated 2 months ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ★84 · Updated this week
- K8s device plugin for GPU sharing ★99 · Updated last year
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ★62 · Updated last month
- Helm charts for the KubeRay project ★40 · Updated last week
- Self-host LLMs with vLLM and BentoML ★90 · Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ★296 · Updated last week
- A diverse, simple, and secure all-in-one LLMOps platform ★100 · Updated 5 months ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ★192 · Updated this week
- ★53 · Updated 2 months ago
- Deploy AI models and apps to Kubernetes without developing a hernia ★32 · Updated 9 months ago
- A curated list of awesome projects and resources related to Kubeflow (a CNCF incubating project) ★204 · Updated 3 months ago
- Repository for open inference protocol specification ★47 · Updated 7 months ago
- A toolkit for discovering cluster network topology. ★36 · Updated this week
- Using LlamaIndex with Ray for productionizing LLM applications ★71 · Updated last year
- ★164 · Updated this week
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ★323 · Updated this week
- Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs. ★53 · Updated this week
- Backend server for envd ★20 · Updated last year
- GenAI inference performance benchmarking tool ★18 · Updated last week
- ★18 · Updated 6 months ago
- Holistic job manager on Kubernetes ★112 · Updated last year
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling. ★140 · Updated 2 years ago
- Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! ★163 · Updated this week
- ★50 · Updated last year