chhzh123 / Krill
An efficient concurrent graph processing system
☆46 Updated 3 years ago
Related projects
Alternatives and complementary repositories for Krill
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆17 Updated 2 years ago
- My paper/code reading notes in Chinese ☆45 Updated 6 months ago
- Graph Sampling using GPU ☆51 Updated 2 years ago
- ngAP's artifact for ASPLOS'24 ☆19 Updated 3 weeks ago
- GVProf: A Value Profiler for GPU-based Clusters ☆47 Updated 7 months ago
- Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs. ☆63 Updated last year
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling. ☆39 Updated 2 years ago
- Vector search with bounded performance. ☆33 Updated 9 months ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆60 Updated 2 years ago
- High performance RDMA-based distributed feature collection component for training GNN model on EXTREMELY large graph ☆48 Updated 2 years ago
- A Framework for Graph Sampling and Random Walk on GPUs. ☆38 Updated 2 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 3 years ago
- Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference ☆16 Updated last year
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆37 Updated 8 months ago
- Seminar on selected tools in Computer Science ☆24 Updated 3 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆22 Updated last month
- Dorylus: Affordable, Scalable, and Accurate GNN Training ☆77 Updated 3 years ago
- A pattern-based algorithmic autotuner for graph processing on GPUs. ☆30 Updated last year
- Rebuild YatSenOS On RISC-V 64. ☆19 Updated 2 years ago
- A Factored System for Sample-based GNN Training over GPUs ☆42 Updated last year
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆31 Updated 4 years ago
- Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural N… ☆28 Updated 2 years ago
- A tool for examining GPU scheduling behavior. ☆70 Updated 3 months ago
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆71 Updated 4 years ago
- FlashMob is a shared-memory random walk system. ☆31 Updated last year