llmos-ai / llmos
An Open Source, Cloud-native AI Infrastructure Platform. Not Just GPUs.
☆49 · Updated last month
Alternatives and similar repositories for llmos
Users who are interested in llmos are comparing it to the libraries listed below.
- An awesome & curated list of best LLMOps tools. ☆161 · Updated last week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. Star to support our work! ☆261 · Updated 3 weeks ago
- A diverse, simple, and secure all-in-one LLMOps platform ☆108 · Updated last year
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second. ☆207 · Updated last month
- Route LLM requests to the best model for the task at hand. ☆107 · Updated 2 weeks ago
- Deploy AI models and apps to Kubernetes without developing a hernia ☆33 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆150 · Updated last week
- Open Weight, tool-calling LLMs ☆156 · Updated 11 months ago
- LM inference server implementation based on *.cpp. ☆276 · Updated last month
- ☆255 · Updated this week
- Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs. ☆91 · Updated last week
- Inference scheduler for llm-d ☆95 · Updated this week
- Wraps any OpenAI API interface as Responses with MCPs support so it supports Codex. Adding any missing stateful features. Ollama and Vllm… ☆110 · Updated 3 months ago
- MCP server connecting to Kubernetes ☆347 · Updated 2 weeks ago
- MaK(Mac+Kubernetes)llama - Running and orchestrating large language models (LLMs) on Kubernetes with macOS nodes. ☆42 · Updated last year
- Distributed KV cache coordinator ☆72 · Updated this week
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆70 · Updated 2 months ago
- Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! ☆217 · Updated last week
- Fine-tune, build, and deploy open-source LLMs easily! ☆478 · Updated last week
- Knowledge for GPTScript ☆29 · Updated 11 months ago
- ✨ Kubewizard is an AI agent for automated Kubernetes troubleshooting and management, based on LangChain and k8s-related tools. ☆26 · Updated 8 months ago
- A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents. ☆697 · Updated last week
- A toolkit for discovering cluster network topology. ☆70 · Updated last week
- This is a landscape of the infrastructure that powers the generative AI ecosystem ☆149 · Updated 11 months ago
- InferX: Inference as a Service Platform ☆136 · Updated last week
- ClearML Fractional GPU - Run multiple containers on the same GPU with driver level memory limitation ✨ and compute time-slicing ☆80 · Updated last year
- ☆147 · Updated 2 weeks ago
- OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs) ☆286 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆129 · Updated this week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆436 · Updated last week