coreweave / ml-containers
☆39 Updated this week
Alternatives and similar repositories for ml-containers
Users that are interested in ml-containers are comparing it to the libraries listed below
- ☆239 Updated last week
- Module, Model, and Tensor Serialization/Deserialization ☆264 Updated 3 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 Updated 4 months ago
- Helm charts for llm-d ☆50 Updated last month
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆108 Updated last week
- ☆31 Updated 4 months ago
- xet client tech, used in huggingface_hub ☆204 Updated this week
- High-performance safetensors model loader ☆55 Updated last month
- The driver for LMCache core to run in vLLM ☆49 Updated 7 months ago
- CUDA checkpoint and restore utility ☆366 Updated 7 months ago
- Simple dependency injection framework for Python ☆21 Updated last year
- ☆315 Updated last year
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆132 Updated last week
- OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs) ☆261 Updated last week
- Benchmark suite for LLMs from Fireworks.ai ☆83 Updated last week
- Common recipes to run vLLM ☆125 Updated this week
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments. ☆29 Updated 5 months ago
- ☆55 Updated 9 months ago
- Distributed KV cache coordinator ☆68 Updated this week
- vLLM adapter for a TGIS-compatible gRPC server. ☆39 Updated this week
- This is a landscape of the infrastructure that powers the generative AI ecosystem ☆149 Updated 10 months ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat… ☆141 Updated this week
- Getting Started with the CoreWeave Kubernetes GPU Cloud ☆74 Updated 3 months ago
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆425 Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆110 Updated last week
- Distributed Model Serving Framework ☆177 Updated 3 weeks ago
- Transformer GPU VRAM estimator ☆66 Updated last year
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated 2 years ago
- [⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI ☆48 Updated 2 months ago
- Repository for open inference protocol specification ☆59 Updated 4 months ago