Solutions to Programming Massively Parallel Processors
☆49Jan 15, 2024Updated 2 years ago
Alternatives and similar repositories for Programming-Massively-Parallel-Processors
Users that are interested in Programming-Massively-Parallel-Processors are comparing it to the libraries listed below
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T…☆77Jan 21, 2021Updated 5 years ago
- ☆23Jun 11, 2025Updated 8 months ago
- Code and notes for the 6 major CUDA parallel computing patterns☆61Jul 30, 2020Updated 5 years ago
- A docker image for One Student One Chip's debug exam☆10Sep 22, 2023Updated 2 years ago
- High-performance RMSNorm implementation using SM core storage (registers and shared memory)☆27Jan 22, 2026Updated last month
- Repository for answers for exercises in Programming Massively Parallel Processors book☆16Aug 10, 2024Updated last year
- ☆15Apr 15, 2022Updated 3 years ago
- Some CUDA projects and utility☆27Nov 7, 2019Updated 6 years ago
- My study notes for mlsys☆14Nov 4, 2024Updated last year
- Xiangshan deterministic workloads generator☆24May 14, 2025Updated 9 months ago
- A toy SysY compiler for the PKU compiler course project, 2023 spring.☆13Aug 31, 2023Updated 2 years ago
- ☆19May 17, 2016Updated 9 years ago
- ☆29Aug 29, 2023Updated 2 years ago
- performance engineering☆30Jul 11, 2024Updated last year
- OSDI 2023 Welder, deep learning compiler☆32Nov 24, 2023Updated 2 years ago
- ☆35Apr 10, 2024Updated last year
- Luthier, a GPU binary instrumentation tool for AMD GPUs☆27Feb 21, 2026Updated last week
- Code base and slides for ECE408:Applied Parallel Programming On GPU.☆145Jul 2, 2021Updated 4 years ago
- Fast OS-level support for GPU checkpoint and restore☆271Sep 28, 2025Updated 5 months ago
- ☆168Updated this week
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆106Jun 28, 2025Updated 8 months ago
- Framework to reduce autotune overhead to zero for well known deployments.☆96Sep 19, 2025Updated 5 months ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆22Nov 11, 2025Updated 3 months ago
- ☆20May 24, 2025Updated 9 months ago
- A simplified flash-attention implementation using cutlass, intended for teaching☆57Aug 12, 2024Updated last year
- ArterialNet reconstructs arterial blood pressure (ABP) waveform☆13Feb 24, 2025Updated last year
- A Vue demo: a Vue clone of the NetEase News mobile site☆10Jul 26, 2017Updated 8 years ago
- Let's count triangles together!☆10Jun 27, 2024Updated last year
- Implementation of Selective Clustering Annotated using Modes of Projections☆11May 19, 2020Updated 5 years ago
- custom controller☆11Jan 3, 2024Updated 2 years ago
- Using machine learning to reverse-engineer synth presets from raw audio.☆13Jan 31, 2019Updated 7 years ago
- ☆13May 8, 2025Updated 9 months ago
- A dataset of 173 progressive metal songs, in both GuitarPro and token formats, as per the specifications in DadaGP.☆17Nov 19, 2024Updated last year
- This repository is the summary of all of our works for the XLA.☆11Jan 14, 2018Updated 8 years ago
- Open source simulator for porous media flow☆14Oct 15, 2022Updated 3 years ago
- ☆11Dec 23, 2025Updated 2 months ago
- ☆14Oct 30, 2024Updated last year
- Generate Linux Perf event tables for Apple Silicon☆17Dec 16, 2025Updated 2 months ago
- RISC-V vector and tensor compute extensions for Vortex GPGPU acceleration for ML workloads. Optimized for transformer models, CNNs, and g…☆21Apr 25, 2025Updated 10 months ago