rapidsai / wholegraph
WholeGraph - large scale Graph Neural Networks
☆106Updated last year
Alternatives and similar repositories for wholegraph
Users that are interested in wholegraph are comparing it to the libraries listed below
- ☆70Updated 4 years ago
- PyTorch Library for Low-Latency, High-Throughput Graph Learning on GPUs.☆302Updated 2 years ago
- distributed-embeddings is a library for building large embedding based models in Tensorflow 2.☆46Updated 2 years ago
- Large scale graph learning on a single machine.☆167Updated 10 months ago
- Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)☆44Updated 2 years ago
- The official SALIENT system described in the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and P…☆40Updated 2 years ago
- Set of datasets for the deep learning recommendation model (DLRM).☆48Updated 3 years ago
- A Python library that transfers PyTorch tensors between CPU and NVMe☆123Updated last year
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef…☆59Updated 3 years ago
- ☆112Updated 4 years ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆367Updated last week
- oneCCL Bindings for Pytorch* (deprecated)☆104Updated last week
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…☆162Updated 2 weeks ago
- Home for OctoML PyTorch Profiler☆114Updated 2 years ago
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…☆41Updated last year
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…☆63Updated 6 months ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…☆154Updated 3 weeks ago
- ☆64Updated this week
- Distributed preprocessing and data loading for language datasets☆40Updated last year
- [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node …☆56Updated 2 years ago
- Research and development for optimizing transformers☆131Updated 4 years ago
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…☆433Updated last week
- Artifact evaluation of the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining"☆23Updated 3 years ago
- GitHub mirror of triton-lang/triton repo.☆113Updated last week
- HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of…☆186Updated 2 months ago
- CUDA Embedding Lookup Kernel Library☆40Updated 2 months ago
- ☆115Updated last year
- A schedule language for large model training☆152Updated 4 months ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API☆92Updated 2 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning☆141Updated 2 years ago