coreweave / ml-containers
☆37Updated this week
Alternatives and similar repositories for ml-containers
Users who are interested in ml-containers are comparing it to the repositories listed below
- IBM development fork of https://github.com/huggingface/text-generation-inference☆61Updated last month
- Module, Model, and Tensor Serialization/Deserialization☆270Updated 2 months ago
- ☆258Updated this week
- Helm charts for llm-d☆50Updated 3 months ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes.☆110Updated 3 weeks ago
- High-performance safetensors model loader☆67Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆131Updated last month
- ☆31Updated 6 months ago
- A top-like tool for monitoring GPUs in a cluster☆85Updated last year
- CUDA checkpoint and restore utility☆377Updated last month
- Cloud Native Benchmarking of Foundation Models☆44Updated 2 months ago
- Benchmark suite for LLMs from Fireworks.ai☆82Updated last week
- A collection of reproducible inference engine benchmarks☆34Updated 6 months ago
- vLLM adapter for a TGIS-compatible gRPC server.☆41Updated this week
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat…☆147Updated this week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs☆440Updated this week
- xet client tech, used in huggingface_hub☆302Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆130Updated this week
- ☆316Updated last year
- ☆56Updated 11 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)☆28Updated 2 years ago
- Home for OctoML PyTorch Profiler☆114Updated 2 years ago
- The driver for LMCache core to run in vLLM☆54Updated 8 months ago
- OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)☆292Updated last week
- ☆57Updated last week
- Transformer GPU VRAM estimator☆67Updated last year
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆29Updated 6 months ago
- Offline optimization of your disaggregated Dynamo graph☆79Updated this week
- GPU Environment Management for Visual Studio Code☆39Updated 2 years ago
- ☆15Updated last month