rapidsai / wholegraph
WholeGraph - large scale Graph Neural Networks
☆101 Updated 2 months ago
Alternatives and similar repositories for wholegraph:
Users who are interested in wholegraph are comparing it to the libraries listed below.
- ☆73 Updated 3 years ago
- The official SALIENT system described in the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and P… ☆38 Updated last year
- ☆104 Updated 3 years ago
- A Python library that transfers PyTorch tensors between CPU and NVMe ☆102 Updated 2 months ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆61 Updated 2 years ago
- Graph Sampling using GPU ☆51 Updated 2 years ago
- distributed-embeddings is a library for building large embedding-based models in TensorFlow 2. ☆43 Updated last year
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆100 Updated last week
- HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of… ☆137 Updated 3 weeks ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 3 years ago
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆38 Updated 10 months ago
- PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Commun… ☆45 Updated last year
- A GPU-accelerated graph learning library for PyTorch, facilitating the scaling of GNN training and inference. ☆125 Updated 2 months ago
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆291 Updated this week
- A fast communication-overlapping library for tensor parallelism on GPUs. ☆280 Updated 3 months ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆128 Updated last week
- Microsoft Collective Communication Library ☆61 Updated 2 months ago
- ☆36 Updated 7 months ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ☆195 Updated last year
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆117 Updated 2 years ago
- Artifact evaluation of the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining" ☆24 Updated 2 years ago
- PyTorch Library for Low-Latency, High-Throughput Graph Learning on GPUs. ☆296 Updated last year
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆133 Updated last year
- ☆46 Updated 2 years ago
- oneCCL Bindings for Pytorch* ☆87 Updated 3 weeks ago
- Automated Parallelization System and Infrastructure for Multiple Ecosystems ☆77 Updated 2 months ago
- Synthesizer for optimal collective communication algorithms ☆102 Updated 9 months ago
- Distributed preprocessing and data loading for language datasets ☆39 Updated 9 months ago
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving ☆17 Updated this week
- ☆180 Updated 6 months ago