czkkkkkk / gccl
☆13Jan 23, 2021Updated 5 years ago
Alternatives and similar repositories for gccl
Users that are interested in gccl are comparing it to the libraries listed below
- ☆10Nov 14, 2023Updated 2 years ago
- ☆49Apr 11, 2025Updated 10 months ago
- An Attention Superoptimizer☆22Jan 20, 2025Updated last year
- this is the release repository of superneurons☆54Feb 13, 2021Updated 5 years ago
- ☆16May 4, 2021Updated 4 years ago
- ☆19Aug 26, 2021Updated 4 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Apr 15, 2022Updated 3 years ago
- Large scale graph learning on a single machine.☆167Feb 25, 2025Updated 11 months ago
- Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)☆44Jul 1, 2023Updated 2 years ago
- Getting Starting with NIMBUS-CORE☆10Dec 16, 2023Updated 2 years ago
- ☆19Jul 1, 2020Updated 5 years ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]☆25Nov 21, 2024Updated last year
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef…☆59Oct 3, 2022Updated 3 years ago
- A Distributed System for GNN Training☆123Sep 29, 2024Updated last year
- Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code☆63Oct 17, 2020Updated 5 years ago
- A pattern-based algorithmic autotuner for graph processing on GPUs.☆32Jun 25, 2025Updated 7 months ago
- Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.☆30Feb 12, 2022Updated 4 years ago
- GPU Optimization and Memory Abstraction Framework☆32Oct 31, 2019Updated 6 years ago
- ☆11Jan 3, 2023Updated 3 years ago
- Unified Sparse Library Wrapper Based on cuSPARSE☆12May 24, 2022Updated 3 years ago
- FTPipe and related pipeline model parallelism research.☆44May 16, 2023Updated 2 years ago
- ☆30Feb 12, 2025Updated last year
- Datacenter simulation toolkit for the OpenDC project☆10Aug 24, 2020Updated 5 years ago
- ☆36Jun 10, 2024Updated last year
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.☆91Nov 23, 2022Updated 3 years ago
- mallocMC: Memory Allocator for Many Core Architectures☆58Feb 2, 2026Updated 2 weeks ago
- ☆38Jan 15, 2021Updated 5 years ago
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation☆22May 29, 2025Updated 8 months ago
- ☆14Dec 13, 2023Updated 2 years ago
- ☆11Nov 14, 2023Updated 2 years ago
- An efficient storage system for concurrent graph processing☆10Feb 1, 2021Updated 5 years ago
- Unifies OS page cache for heterogeneous systems☆12Jul 26, 2019Updated 6 years ago
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing☆11Apr 1, 2020Updated 5 years ago
- Quicksilver superpage management system☆11May 14, 2021Updated 4 years ago
- Neural machine translation with Recurrent Deterministic Policy Gradient☆10Aug 18, 2016Updated 9 years ago
- Efficient-Tensor-Management-on-HM-for-Deep-Learning☆10Nov 15, 2021Updated 4 years ago
- ☆10Aug 2, 2021Updated 4 years ago
- Multi-GPU Computing Benchmark Suite (CUDA)☆43Jun 12, 2017Updated 8 years ago
- Parallel k-core Decomposition on Multicore Platforms☆11Oct 12, 2020Updated 5 years ago