clearml / clearml-fractional-gpu
ClearML Fractional GPU - Run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing
☆83 · Updated last week
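The project's stated purpose, running several containers on one GPU with a hard memory cap enforced at the driver level, can be sketched as a `docker run` invocation. This is a minimal sketch, not the project's documented command: the image tag and the exact required flags are assumptions to check against the repository's README.

```shell
# Hypothetical sketch: launch a container pinned to GPU 0 with a
# driver-level 8 GB memory cap (image tag is an assumed example;
# consult the clearml-fractional-gpu README for published tags).
# --ipc=host and --pid=host are assumed here because driver-level
# limiters typically need visibility into host namespaces.
docker run -it --gpus 0 --ipc=host --pid=host \
    clearml/fractional-gpu:u22-cu12.3-8gb \
    nvidia-smi
```

If the limiter works as described, `nvidia-smi` inside the container should report the capped memory size rather than the card's full capacity, and several such containers can share the same physical GPU.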
Alternatives and similar repositories for clearml-fractional-gpu
Users interested in clearml-fractional-gpu are comparing it to the libraries listed below.
- ☆267 · Updated this week
- A top-like tool for monitoring GPUs in a cluster ☆85 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆273 · Updated 3 months ago
- GPU environment and cluster management with LLM support ☆652 · Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 · Updated 2 months ago
- Self-host LLMs with vLLM and BentoML ☆158 · Updated 3 weeks ago
- 🕹️ Performance comparison of MLOps engines, frameworks, and languages on mainstream AI models ☆139 · Updated last year
- ☆64 · Updated 7 months ago
- ☆16 · Updated 2 months ago
- Triton Model Navigator is an inference toolkit for optimizing and deploying Deep Learning models, with a focus on NVIDIA GPUs ☆213 · Updated 7 months ago
- The Triton backend for PyTorch TorchScript models ☆165 · Updated this week
- Where GPUs get cooked 👩🍳🔥 ☆317 · Updated 2 months ago
- Benchmark suite for LLMs from Fireworks.ai ☆83 · Updated last week
- Repository for the open inference protocol specification ☆59 · Updated 6 months ago
- ☆42 · Updated 3 weeks ago
- The backend behind the LLM-Perf Leaderboard ☆11 · Updated last year
- A collection of all available inference solutions for LLMs ☆92 · Updated 8 months ago
- Inference server benchmarking tool ☆128 · Updated last month
- ☆313 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated last year
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆70 · Updated this week
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆201 · Updated this week
- The Triton backend for ONNX Runtime ☆167 · Updated last week
- Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray ☆131 · Updated 2 months ago
- Distributed model serving framework ☆178 · Updated last month
- Dynamic batching library for Deep Learning inference, with tutorials for LLM and GPT scenarios ☆104 · Updated last year
- OpenAI-compatible API for the TensorRT-LLM Triton backend ☆217 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server ☆44 · Updated this week
- A tool to configure, launch, and manage your machine learning experiments ☆208 · Updated this week
- xet client tech, used in huggingface_hub ☆322 · Updated last week