mabdullahsoyturk/HPC-Paper-Notes

mabdullahsoyturk / HPC-Paper-Notes

My notes on various HPC papers.

☆27

Alternatives and similar repositories for HPC-Paper-Notes

Users who are interested in HPC-Paper-Notes are comparing it to the repositories listed below.

ParCoreLab / CPU-Free-model
Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme…
☆21Apr 25, 2024Updated 2 years ago
lixiuhong / implicit_gemm_convolution
☆14May 28, 2019Updated 7 years ago
PanZaifeng / RecFlex
A recommendation model kernel optimizing system
☆12Jun 5, 2025Updated last year
eth-cscs / spla
Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler…
☆32Jun 26, 2024Updated 2 years ago
matiaslindgren / cuda-memory-access-recorder
Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser
☆13Nov 17, 2020Updated 5 years ago
LesleyLai / CUDA-flocking-boid
☆14Dec 29, 2020Updated 5 years ago
lashgar / ipmacc
IPMACC is a framework for translating OpenACC for C API to CUDA, OpenCL, and Intel ISPC.
☆15Apr 21, 2026Updated 3 months ago
ricosjp / allgebra
Base container for developing C++ and Fortran HPC applications
☆18Jun 14, 2022Updated 4 years ago
IcyCC / reimu
A Linux network library in modern C++ based on an event loop and multithreading
☆11Apr 5, 2020Updated 6 years ago
AlibabaResearch / recom
An Optimizing Compiler for Recommendation Model Inference
☆26Jun 5, 2025Updated last year
chenshuaihao / KV_Store_Engine_TaurusDB_Race
Huawei Cloud TaurusDB Performance Challenge (HUAWEI TaurusDB Race)
☆10Aug 21, 2019Updated 6 years ago
TomHeaven / Dark-Channel-Haze-Removal-with-CUDA
Dark channel Haze removal algorithm with CUDA acceleration (typically 10x or more speedup using a Nvidia GPU)
☆14Dec 7, 2017Updated 8 years ago
ParCoreLab / ComScribe
ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.
☆27Jul 6, 2023Updated 3 years ago
brycelelbach / 2016_berkeley_cpp_summit_presentations
Presentation materials for the 2016 Berkeley C++ Summit
☆14Oct 20, 2016Updated 9 years ago
eth-cscs / gpu-training
☆10Jul 16, 2016Updated 10 years ago
eyalroz / gpu-kernel-runner
Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line
☆26Jun 10, 2026Updated last month
dpankratz / TVMFuzz
TVMFuzz: fuzzing tensor-level intermediate representation in TVM
☆32May 24, 2020Updated 6 years ago
LIS-Laboratory / cupc
cuPC: CUDA-based Parallel PC Algorithm for Causal Structure Learning on GPU
☆16Mar 19, 2021Updated 5 years ago
kaletap / bfs-cuda-gpu
Implementation of parallel Breadth First Algorithm for graph traversal using CUDA and C++ language.
☆35Dec 12, 2019Updated 6 years ago
illuhad / syclinfo
List all available information about all SYCL devices and platforms
☆15Sep 14, 2020Updated 5 years ago
yester31 / Cutlass_EX
study of cutlass
☆22Nov 10, 2024Updated last year
NTNU-HPC-Lab / BAT
A GPU benchmark suite for autotuners
☆19Feb 20, 2024Updated 2 years ago
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆59Mar 20, 2025Updated last year
ParCIS / Ok-Topk
Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…
☆27Dec 10, 2022Updated 3 years ago
awsdocs / aws-cpp-developer-guide
Content for the AWS SDK for C++ Developer Guide. For more info about the AWS C++ SDK, go to http://github.com/aws/aws-sdk-cpp
☆20Jun 15, 2023Updated 3 years ago
IcyCC / AboutMe
about me
☆13Mar 10, 2022Updated 4 years ago
WenqiJiang / Convolution-Neural-Network-by-pyCUDA
pyCUDA implementation of forward propagation for Convolutional Neural Networks
☆18Jan 4, 2019Updated 7 years ago
oneapi-src / ishmem
Intel® SHMEM - Device initiated shared memory based communication library
☆33Nov 12, 2025Updated 8 months ago
Proof-Of-Hack-Protocol / challenges
challenges-repository
☆14Nov 3, 2022Updated 3 years ago
nomp-org / libnomp
libnomp is a loopy based runtime for C programming language to create domain specific compilers.
☆17Jun 25, 2026Updated last month
OpenPPL / ppl.kernel.cpu
☆19Apr 6, 2024Updated 2 years ago
leimao / Doxygen-CPP-TriangleLib
Using Doxygen to Document C++ Libraries
☆11Aug 4, 2020Updated 5 years ago
richursa / cpuBitonicSort
OpenMP implementation of parallel bitonic sort
☆22Jun 17, 2019Updated 7 years ago
shamanDevel / cuMat
An expression template based linear algebra library running completely on the GPU using CUDA
☆26Jun 24, 2021Updated 5 years ago
rybchuk / tutorial-towards-multinode-ml-training2025
A tutorial showcasing the process of scaling up a SciML algorithm for multi-node training
☆16Sep 27, 2025Updated 9 months ago
bhonesh1998 / Hackerearth_Solutions
This repository contains solutions to HackerEarth problems. Each problem name is the same as the file name, and the file contains the solution. Solutions may be in C, C++,…
☆17Oct 15, 2019Updated 6 years ago
XiaoSongXS / HPC-Notes
Personal Notes for Learning HPC & Parallel Computation [NO LONGER ADDING NEW CONTENT]
☆78Jul 29, 2022Updated 3 years ago
jdmccalpin / SKX-SF-Conflicts
Repeated access to L2-containable loops to look for snoop filter conflicts on Intel Skylake Xeon processors.
☆30Aug 17, 2018Updated 7 years ago
PASSIONLab / ELBA
Parallel String Graph Construction, Transitive Reduction, and Contig Generation for De Novo Genome Assembly
☆16Jun 11, 2024Updated 2 years ago