NVIDIA / cuda-checkpoint
CUDA checkpoint and restore utility
☆224 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for cuda-checkpoint
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆112 · Updated last year
- NCCL Profiling Kit ☆112 · Updated 4 months ago
- cricket is a virtualization solution for GPUs ☆153 · Updated 10 months ago
- NVIDIA NCCL Tests for Distributed Training ☆70 · Updated 2 weeks ago
- Efficient and easy multi-instance LLM serving ☆213 · Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆188 · Updated last month
- ☆271 · Updated 3 months ago
- Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the… ☆270 · Updated last week
- ☆214 · Updated this week
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆250 · Updated this week
- A library to analyze PyTorch traces. ☆307 · Updated this week
- An interference-aware scheduler for fine-grained GPU sharing ☆110 · Updated 6 months ago
- A low-latency & high-throughput serving engine for LLMs ☆245 · Updated 2 months ago
- Microsoft Collective Communication Library ☆53 · Updated last month
- Ultra-Fast and Cheaper Long-Context LLM Inference ☆233 · Updated this week
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆124 · Updated this week
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆147 · Updated this week
- Cloud Native Benchmarking of Foundation Models ☆21 · Updated 2 weeks ago
- RDMA and SHARP plugins for nccl library ☆162 · Updated last week
- An efficient GPU resource sharing system with fine-grained control for Linux platforms. ☆73 · Updated 7 months ago
- A resilient distributed training framework ☆85 · Updated 7 months ago
- ☆36 · Updated 2 months ago
- MIG Partition Editor for NVIDIA GPUs ☆174 · Updated this week
- A tool for bandwidth measurements on NVIDIA GPUs. ☆321 · Updated last month
- Hooked CUDA-related dynamic libraries by using automated code generation tools. ☆139 · Updated 11 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆101 · Updated 8 months ago
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆232 · Updated this week
- Artifact of OSDI '24 paper, “Llumnix: Dynamic Scheduling for Large Language Model Serving” ☆57 · Updated 5 months ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling ☆57 · Updated 6 months ago
- NVIDIA GPUDirect Storage Driver ☆202 · Updated this week