clearml / clearml-fractional-gpu
ClearML Fractional GPU - Run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing
☆88 · Updated last month
Alternatives and similar repositories for clearml-fractional-gpu
Users interested in clearml-fractional-gpu are comparing it to the libraries listed below.
- GPU environment and cluster management with LLM support ☆657 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆163 · Updated last month
- A top-like tool for monitoring GPUs in a cluster ☆84 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆283 · Updated 4 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 · Updated 3 months ago
- ☆275 · Updated this week
- Repository for open inference protocol specification ☆61 · Updated 8 months ago
- ☆67 · Updated 9 months ago
- ☆16 · Updated last month
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models ☆139 · Updated last year
- The backend behind the LLM-Perf Leaderboard ☆11 · Updated last year
- Where GPUs get cooked 👩‍🍳🔥 ☆347 · Updated 3 months ago
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray ☆131 · Updated 3 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs ☆214 · Updated 8 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 · Updated this week
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆204 · Updated this week
- Inference server benchmarking tool ☆136 · Updated 3 months ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware ☆146 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆268 · Updated last month
- ☆282 · Updated 10 months ago
- ☆43 · Updated last week
- The Triton backend for PyTorch TorchScript models ☆170 · Updated this week
- ☆198 · Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆84 · Updated last month
- ClearML - Model-Serving Orchestration and Repository Solution ☆160 · Updated last week
- A Lossless Compression Library for AI pipelines ☆290 · Updated 6 months ago
- ☆27 · Updated this week
- vLLM adapter for a TGIS-compatible gRPC server ☆47 · Updated this week
- OpenAI-compatible API for the TensorRT-LLM Triton backend ☆218 · Updated last year
- ☆18 · Updated last year