nvidia-china-sae / WholeGraph
☆11Updated 3 years ago
Related projects:
- ☆71Updated 3 years ago
- PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Commun…☆45Updated last year
- Light-weight GPU kernel interface for graph operations☆15Updated 4 years ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef…☆58Updated last year
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021☆54Updated 3 years ago
- LazyGCN☆9Updated 3 years ago
- [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node …☆51Updated 11 months ago
- ICLR 2021☆42Updated 3 years ago
- An Attention Superoptimizer☆19Updated 4 months ago
- Inference framework for MoE layers based on TensorRT with Python binding☆41Updated 3 years ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …☆101Updated 9 months ago
- Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural N…☆26Updated 2 years ago
- ☆50Updated 3 months ago
- The official SALIENT system described in the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and P…☆38Updated last year
- A Python library that transfers PyTorch tensors between CPU and NVMe☆92Updated last year
- Memory Optimizations for Deep Learning (ICML 2023)☆58Updated 6 months ago
- ☆23Updated 9 months ago
- ☆96Updated 3 years ago
- WholeGraph - large scale Graph Neural Networks☆97Updated last week
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.☆33Updated last year
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"☆39Updated 2 years ago
- ☆83Updated 3 weeks ago
- Distributed DataLoader For PyTorch Based On Ray☆24Updated 2 years ago
- Largest real-world open-source graph dataset - Work done under IBM-Illinois Discovery Accelerator Institute and Amazon Research Awards a…☆74Updated last week
- ☆14Updated last year
- ☆44Updated 2 years ago
- Odysseus: Playground of LLM Sequence Parallelism☆50Updated 3 months ago
- [MLSys 2023] Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models☆16Updated last year
- ☆34Updated 3 months ago
- Set of datasets for the deep learning recommendation model (DLRM).☆39Updated last year