clearml / clearml-fractional-gpu
ClearML Fractional GPU - Run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing
☆88 · Updated last month
Alternatives and similar repositories for clearml-fractional-gpu
Users interested in clearml-fractional-gpu are comparing it to the libraries listed below.
- GPU environment and cluster management with LLM support ☆656 · Updated last year
- A top-like tool for monitoring GPUs in a cluster ☆85 · Updated last year
- ☆273 · Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆279 · Updated 4 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 · Updated 3 months ago
- ☆66 · Updated 8 months ago
- Self-host LLMs with vLLM and BentoML ☆161 · Updated 3 weeks ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆119 · Updated last week
- The backend behind the LLM-Perf Leaderboard ☆11 · Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆139 · Updated last year
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆146 · Updated last year
- Repository for open inference protocol specification ☆61 · Updated 7 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆216 · Updated 7 months ago
- Where GPUs get cooked 👩🍳🔥 ☆335 · Updated 3 months ago
- ClearML - Model-Serving Orchestration and Repository Solution ☆159 · Updated last month
- vLLM adapter for a TGIS-compatible gRPC server. ☆45 · Updated this week
- Inference server benchmarking tool ☆130 · Updated 2 months ago
- ☆16 · Updated 3 weeks ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 2 months ago
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 · Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆84 · Updated 3 weeks ago
- ☆283 · Updated 9 months ago
- Distributed Model Serving Framework ☆180 · Updated 2 months ago
- ☆42 · Updated last week
- ☆198 · Updated last year
- OpenAI compatible API for TensorRT LLM triton backend ☆218 · Updated last year
- ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution ☆282 · Updated last month
- ScalarLM - a unified training and inference stack ☆93 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated 2 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 · Updated this week