clearml / clearml-fractional-gpu
ClearML Fractional GPU - Run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing
☆80 · Updated last year
Alternatives and similar repositories for clearml-fractional-gpu
Users interested in clearml-fractional-gpu are comparing it to the libraries listed below.
- Module, Model, and Tensor Serialization/Deserialization ☆267 · Updated last month
- ☆255 · Updated last week
- GPU environment and cluster management with LLM support ☆642 · Updated last year
- A top-like tool for monitoring GPUs in a cluster ☆85 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆150 · Updated 2 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 · Updated 3 weeks ago
- ☆64 · Updated 6 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying deep learning models, with a focus on NVIDIA GPUs. ☆211 · Updated 5 months ago
- Inference server benchmarking tool ☆114 · Updated last week
- OpenAI-compatible API for the TensorRT-LLM Triton backend ☆215 · Updated last year
- 🕹️ Performance comparison of MLOps engines, frameworks, and languages on mainstream AI models ☆138 · Updated last year
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes ☆110 · Updated last week
- Where GPUs get cooked 👩‍🍳🔥 ☆293 · Updated 3 weeks ago
- A tool to configure, launch, and manage your machine learning experiments ☆197 · Updated this week
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray ☆132 · Updated 3 weeks ago
- vLLM adapter for a TGIS-compatible gRPC server ☆41 · Updated this week
- Google TPU optimizations for transformers models ☆120 · Updated 8 months ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware ☆145 · Updated last year
- ☆278 · Updated 7 months ago
- Controller for ModelMesh ☆237 · Updated 4 months ago
- Benchmark suite for LLMs from Fireworks.ai ☆83 · Updated last week
- A collection of all available inference solutions for LLMs ☆91 · Updated 7 months ago
- ☆197 · Updated last year
- ClearML - Model-Serving Orchestration and Repository Solution ☆157 · Updated last week
- ☆300 · Updated this week
- ☆15 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆265 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆90 · Updated this week
- Ray - A curated list of resources: https://github.com/ray-project/ray ☆69 · Updated 3 months ago
- Distributed Model Serving Framework ☆178 · Updated last week