rapidsai / wholegraph
WholeGraph - large scale Graph Neural Networks
☆101 Updated 4 months ago
Alternatives and similar repositories for wholegraph:
Users who are interested in wholegraph are comparing it to the libraries listed below:
- ☆73 Updated 3 years ago
- ☆107 Updated 3 years ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆61 Updated 2 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆135 Updated last year
- distributed-embeddings is a library for building large embedding based models in Tensorflow 2. ☆43 Updated last year
- A Python library that transfers PyTorch tensors between CPU and NVMe ☆111 Updated 4 months ago
- PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Commun… ☆45 Updated last year
- HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of… ☆141 Updated 3 weeks ago
- The official SALIENT system described in the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and P… ☆38 Updated last year
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆42 Updated last year
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆107 Updated last week
- An experimental parallel training platform ☆54 Updated last year
- ☆141 Updated last month
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 3 years ago
- Graph Sampling using GPU ☆51 Updated 3 years ago
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆73 Updated 4 years ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆76 Updated last year
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. By pro… ☆70 Updated this week
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆119 Updated 2 years ago
- ☆76 Updated 2 years ago
- High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline. ☆101 Updated 8 months ago
- ☆44 Updated last year
- A library of GPU kernels for sparse matrix operations. ☆260 Updated 4 years ago
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel … ☆179 Updated 2 months ago
- Distributed Multi-GPU GNN Framework ☆37 Updated 4 years ago
- ☆49 Updated 5 years ago
- ☆193 Updated 8 months ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆64 Updated 3 years ago
- Artifact evaluation of the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining" ☆25 Updated 3 years ago
- Automated Parallelization System and Infrastructure for Multiple Ecosystems ☆78 Updated 4 months ago