coreweave / ml-containers
☆34 Updated last week
Alternatives and similar repositories for ml-containers
Users who are interested in ml-containers are comparing it to the libraries listed below.
- Module, Model, and Tensor Serialization/Deserialization ☆232 Updated last week
- Helm charts for llm-d ☆35 Updated this week
- ☆214 Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆60 Updated 3 weeks ago
- NVIDIA NCCL Tests for Distributed Training ☆91 Updated last week
- A top-like tool for monitoring GPUs in a cluster ☆85 Updated last year
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆95 Updated 2 weeks ago
- ☆12 Updated last year
- CUDA checkpoint and restore utility ☆339 Updated 4 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆127 Updated last month
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆169 Updated this week
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆45 Updated last week
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆10 Updated last month
- Large Language Model Text Generation Inference on Habana Gaudi ☆33 Updated 2 months ago
- Repository for open inference protocol specification ☆56 Updated 3 weeks ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆30 Updated this week
- Inference server benchmarking tool ☆67 Updated last month
- Holistic job manager on Kubernetes ☆115 Updated last year
- Home for OctoML PyTorch Profiler ☆113 Updated 2 years ago
- Cloud Native Benchmarking of Foundation Models ☆34 Updated 2 weeks ago
- A collection of reproducible inference engine benchmarks ☆31 Updated last month
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated last year
- ☆53 Updated 8 months ago
- 📡 Deploy AI models and apps to Kubernetes without developing a hernia ☆32 Updated last year
- The driver for LMCache core to run in vLLM ☆41 Updated 3 months ago
- ☆27 Updated last month
- Simple dependency injection framework for Python ☆21 Updated last year
- Distributed ML Optimizer ☆32 Updated 3 years ago
- xet client tech, used in huggingface_hub ☆107 Updated this week
- The Triton backend for the PyTorch TorchScript models. ☆149 Updated 2 weeks ago