Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (Third Edition)
☆77Jan 21, 2021Updated 5 years ago
Alternatives and similar repositories for pmpp__programming_massively_parallel_processors
Users that are interested in pmpp__programming_massively_parallel_processors are comparing it to the libraries listed below
- ☆220Aug 2, 2024Updated last year
- Exploring how optimizations for GEMMs work☆28Jan 1, 2026Updated 2 months ago
- ☆16May 14, 2025Updated 9 months ago
- An implementation of run-length encoding for PyTorch tensors using CUDA☆14Apr 7, 2021Updated 4 years ago
- ECE408 (Applied Parallel Programming) Fall 2022 MP☆19Mar 24, 2023Updated 2 years ago
- Dissecting NVIDIA GPU Architecture☆116Jul 11, 2022Updated 3 years ago
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- My submission for the GPUMODE/AMD fp8 mm challenge☆29Jun 4, 2025Updated 9 months ago
- ☆49Mar 14, 2025Updated 11 months ago
- Class of High Performance Computing taken at U.T.P 2017☆111Oct 11, 2017Updated 8 years ago
- ☆50Dec 4, 2023Updated 2 years ago
- ☆44Updated this week
- Enlightener, the cutting-edge Retrieval-Augmented Generation (RAG) system that revolutionizes query responses. By combining the power of …☆14Jul 28, 2025Updated 7 months ago
- A complete CUDA tutorial ranging from first GPU programs to advanced asynchronous methods☆29Jan 22, 2026Updated last month
- ☆68Jun 23, 2025Updated 8 months ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning☆25May 12, 2025Updated 9 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.☆327Updated this week
- Material for gpu-mode lectures☆5,800Feb 1, 2026Updated last month
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆29Jun 30, 2025Updated 8 months ago
- Android Automotive Testapp☆13Feb 10, 2023Updated 3 years ago
- A collection of study materials for AI compilers and systems.☆55Nov 14, 2025Updated 3 months ago
- GPU programming related news and material links☆2,010Sep 17, 2025Updated 5 months ago
- Materials for learning SGLang☆766Jan 5, 2026Updated 2 months ago
- EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks…☆68Dec 16, 2025Updated 2 months ago
- ☆79Nov 26, 2024Updated last year
- Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…☆526Sep 8, 2024Updated last year
- Code for Towards More Practical Adversarial Attacks on Graph Neural Networks (NeurIPS 2020)☆28Nov 13, 2021Updated 4 years ago
- Telegram bot for facilitating, accelerating, and automating the reservations at the University of Milan’s libraries, specifically the Bib…☆20Feb 12, 2026Updated 3 weeks ago
- ☆28Dec 3, 2025Updated 3 months ago
- Simulations for light-pulse atom interferometry☆13Feb 23, 2026Updated last week
- AI voice assistant that uses Twilio Voice and ConversationRelay, and the Google Gemini API to engage in two-way conversations over a phon…☆23Feb 19, 2026Updated 2 weeks ago
- A distributed data flow and computation system that runs on transactional messaging infrastructure☆11Oct 22, 2022Updated 3 years ago
- Advanced Python (German)☆10Sep 5, 2023Updated 2 years ago
- Research on a time bank system based on the mutual-aid elderly care model (Cheng Cheng)☆10Nov 18, 2014Updated 11 years ago
- word2vec source code with detailed bilingual annotations (well-annotated word2vec)☆10Oct 3, 2021Updated 4 years ago
- Data Science Foundations: Python Scientific Stack☆11Jun 2, 2022Updated 3 years ago
- This is a repository for the LinkedIn Learning course GitHub Essential Training: The Basics☆13Aug 1, 2023Updated 2 years ago
- Xiao's CUDA Optimization Guide [NO LONGER ADDING NEW CONTENT]☆323Nov 8, 2022Updated 3 years ago
- Code base and slides for ECE408:Applied Parallel Programming On GPU.☆144Jul 2, 2021Updated 4 years ago