ntuhpc / training-ay1819
sample code/text used in NTU HPC Internal Training during AY2018-2019
☆24 Updated 5 years ago
Related projects
Alternatives and complementary repositories for training-ay1819
- Seminar on selected tools in Computer Science ☆24 Updated 3 years ago
- IMPACT GPU Algorithms Teaching Labs ☆55 Updated last year
- An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3 ☆27 Updated 3 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆114 Updated 2 years ago
- My paper/code reading notes in Chinese ☆45 Updated 6 months ago
- An efficient concurrent graph processing system ☆46 Updated 3 years ago
- Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training ☆20 Updated 8 months ago
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆31 Updated 3 years ago
- Analysis for the traces from byteprofile ☆29 Updated last year
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆128 Updated 4 years ago
- Graph Sampling using GPU ☆51 Updated 2 years ago
- A hybrid partitioner based quantum circuit simulation system on GPU ☆47 Updated 2 years ago
- GPU Performance Advisor ☆63 Updated 2 years ago
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆71 Updated 4 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆31 Updated 4 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 3 years ago
- Repository holding the code base to AC-SpGEMM : "Adaptive Sparse Matrix-Matrix Multiplication on the GPU" ☆28 Updated 4 years ago
- Modified version of PyTorch able to work with changes to GPGPU-Sim ☆45 Updated 2 years ago
- GVProf: A Value Profiler for GPU-based Clusters ☆48 Updated 7 months ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆60 Updated 2 years ago
- Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural N… ☆28 Updated 2 years ago
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆99 Updated this week
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. ☆45 Updated last year
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆102 Updated last year
- A GPU FP32 computation method with Tensor Cores. ☆18 Updated 2 years ago
- Parser script that processes PyTorch autograd profiler results and converts the JSON file to Excel. ☆12 Updated 5 years ago
- PTX-EMU is a simple emulator for CUDA programs. ☆24 Updated 10 months ago
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions. ☆45 Updated last month