coreweave / ml-containers
☆36 · Updated this week
Alternatives and similar repositories for ml-containers
Users interested in ml-containers are comparing it to the libraries listed below.
- A top-like tool for monitoring GPUs in a cluster ☆84 · Updated last year
- ☆221 · Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆240 · Updated last week
- The driver for LMCache core to run in vLLM ☆41 · Updated 4 months ago
- High-performance safetensors model loader ☆39 · Updated 2 weeks ago
- Helm charts for llm-d ☆42 · Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆97 · Updated this week
- CUDA checkpoint and restore utility ☆345 · Updated 4 months ago
- A collection of reproducible inference engine benchmarks ☆31 · Updated 2 months ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆177 · Updated 2 weeks ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 · Updated last year
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆98 · Updated this week
- ☆310 · Updated 10 months ago
- Pygloo provides Python bindings for Gloo. ☆22 · Updated 3 months ago
- ☆28 · Updated 2 months ago
- Benchmark suite for LLMs from Fireworks.ai ☆76 · Updated 2 weeks ago
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆48 · Updated last month
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat… ☆123 · Updated this week
- A toolkit for discovering cluster network topology. ☆54 · Updated last week
- ☆55 · Updated 9 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆32 · Updated this week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆113 · Updated this week
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆18 · Updated 2 years ago
- GPU Environment Management for Visual Studio Code ☆38 · Updated last year
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray ☆129 · Updated last month
- Large Language Model Text Generation Inference on Habana Gaudi ☆33 · Updated 3 months ago
- Home for OctoML PyTorch Profiler ☆113 · Updated 2 years ago
- ☆54 · Updated 7 months ago
- ☆21 · Updated 3 months ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆61 · Updated 2 months ago