paperg / NCCL_GP
Separate from hardware and used to learn some NCCL mechanisms
☆14 Updated 10 months ago
Alternatives and similar repositories for NCCL_GP:
Users who are interested in NCCL_GP are comparing it to the libraries listed below
- NCCL Profiling Kit ☆127 Updated 7 months ago
- ☆142 Updated last month
- Repository for MLCommons Chakra schema and tools ☆84 Updated 2 weeks ago
- Synthesizer for optimal collective communication algorithms ☆103 Updated 10 months ago
- ☆28 Updated last month
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling ☆58 Updated 9 months ago
- ☆75 Updated 2 years ago
- ☆97 Updated 2 months ago
- RDMA and SHARP plugins for nccl library ☆176 Updated 3 weeks ago
- Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving" ☆60 Updated 8 months ago
- ☆130 Updated 11 months ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆79 Updated last year
- Hooked CUDA-related dynamic libraries by using automated code generation tools. ☆145 Updated last year
- ☆20 Updated this week
- High performance Transformer implementation in C++. ☆101 Updated last month
- PyTorch distributed training acceleration framework ☆39 Updated this week