Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
☆40 · Mar 17, 2024 · Updated 2 years ago
Alternatives and similar repositories for MGG_OSDI23
Users that are interested in MGG_OSDI23 are comparing it to the libraries listed below.
- Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core. ☆30 · Feb 12, 2022 · Updated 4 years ago
- ☆42 · Jun 13, 2025 · Updated 9 months ago
- Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs. ☆70 · Mar 2, 2023 · Updated 3 years ago
- A GPU algorithm for sparse matrix-matrix multiplication ☆75 · Oct 1, 2020 · Updated 5 years ago
- ☆47 · Sep 5, 2022 · Updated 3 years ago
- A Factored System for Sample-based GNN Training over GPUs ☆46 · Jul 26, 2023 · Updated 2 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆41 · Nov 16, 2021 · Updated 4 years ago
- A reading list for deep graph learning acceleration. ☆254 · Jul 26, 2025 · Updated 7 months ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme… ☆22 · Apr 25, 2024 · Updated last year
- Fast GPU based tensor core reductions ☆13 · Jan 13, 2023 · Updated 3 years ago
- ☆15 · Feb 20, 2024 · Updated 2 years ago
- A dataflow architecture for universal graph neural network inference via multi-queue streaming. ☆75 · Dec 19, 2022 · Updated 3 years ago
- ☆14 · Jan 12, 2022 · Updated 4 years ago
- SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization ☆11 · Aug 12, 2020 · Updated 5 years ago
- PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class. ☆10 · Feb 10, 2022 · Updated 4 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆92 · Nov 23, 2022 · Updated 3 years ago
- This repo is the "NTHU Parallel Programming" course project. ☆10 · Dec 5, 2017 · Updated 8 years ago
- Large scale graph learning on a single machine. ☆167 · Feb 25, 2025 · Updated last year
- Multi-GPU dynamic scheduler using PGAS style cross-GPU communication ☆29 · Jul 23, 2023 · Updated 2 years ago
- [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node … ☆56 · Oct 6, 2023 · Updated 2 years ago
- LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation ICML_2023 ☆13 · Oct 27, 2023 · Updated 2 years ago
- A list of awesome GNN systems. ☆337 · Updated this week
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel … ☆192 · Jan 28, 2025 · Updated last year
- ☆22 · Mar 2, 2025 · Updated last year
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. ☆55 · Oct 16, 2023 · Updated 2 years ago
- PyTorch Codes for Haar Graph Pooling ☆11 · Feb 16, 2023 · Updated 3 years ago
- Standardized higher-order datasets with corresponding datasheets ☆19 · Aug 17, 2025 · Updated 7 months ago
- Horizontal Fusion ☆24 · Jan 7, 2022 · Updated 4 years ago
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆22 · Sep 7, 2022 · Updated 3 years ago
- A Streaming-Native Serving Engine for TTS/STS Models ☆60 · Feb 22, 2026 · Updated last month
- ☆23 · Oct 31, 2023 · Updated 2 years ago
- Sources for the Multi-Clock system as described in the paper: MULTI-CLOCK: Dynamic Tiering for Hybrid Memory Systems, HPCA 2022. ☆19 · Mar 21, 2022 · Updated 4 years ago
- Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training ☆24 · Mar 1, 2024 · Updated 2 years ago
- A High performance and tiny TVM graph executor library written in C which can compile to WebAssembly and use CUDA/WebGPU as the accelerat… ☆12 · Aug 3, 2023 · Updated 2 years ago
- An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU. ☆24 · Jan 3, 2022 · Updated 4 years ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆59 · Oct 3, 2022 · Updated 3 years ago
- ☆18 · Mar 4, 2025 · Updated last year
- ☆13 · Jan 23, 2021 · Updated 5 years ago
- LLM serving cluster simulator ☆138 · Apr 25, 2024 · Updated last year