☆234 · Aug 2, 2024 · Updated last year
Alternatives and similar repositories for Programming-Massively-Parallel-Processors
Users that are interested in Programming-Massively-Parallel-Processors are comparing it to the libraries listed below.
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T… · ☆79 · Jan 21, 2021 · Updated 5 years ago
- Solution of Programming Massively Parallel Processors · ☆51 · Jan 15, 2024 · Updated 2 years ago
- Create cohorts from databases utilizing the OMOP CDM · ☆10 · May 19, 2025 · Updated last year
- CUDA: six major parallel computing patterns, code and notes · ☆63 · Jul 30, 2020 · Updated 5 years ago
- ☆49 · Apr 15, 2024 · Updated 2 years ago
- Material for gpu-mode lectures · ☆6,178 · Updated this week
- Optimized Parallel Tiled Approach to perform Matrix Multiplication by taking advantage of the lower latency, higher bandwidth shared memo… · ☆17 · Sep 24, 2017 · Updated 8 years ago
- ☆14 · Mar 8, 2025 · Updated last year
- Dissecting NVIDIA GPU Architecture · ☆122 · Jul 11, 2022 · Updated 3 years ago
- Code base and slides for ECE408: Applied Parallel Programming On GPU. · ☆148 · Jul 2, 2021 · Updated 4 years ago
- ☆10 · Dec 15, 2023 · Updated 2 years ago
- Step-by-step optimization of CUDA SGEMM · ☆475 · Mar 30, 2022 · Updated 4 years ago
- Jumpstart your custom DNN accelerator today. This project holds scripts to build and start containers that can compile binaries to the ze… · ☆10 · Jun 17, 2020 · Updated 5 years ago
- Notes on "Programming Massively Parallel Processors" by Hwu, Kirk, and Hajj (4th ed.) · ☆53 · Aug 8, 2024 · Updated last year
- BitcoinWallet for learning bitcoin technologies easier (tested on JCIDE and JC30M48) · ☆17 · Dec 15, 2016 · Updated 9 years ago
- CUDA GPU Benchmark · ☆38 · Jan 31, 2025 · Updated last year
- Awesome code, projects, books, etc. related to CUDA · ☆37 · Jun 2, 2026 · Updated last week
- ☆18 · Jan 4, 2024 · Updated 2 years ago
- ☆28 · Oct 11, 2022 · Updated 3 years ago
- STM32 NFC NXP MFRC630, CLRC663, ISO14443A, ISO14443A-4, ISO7816-4 APDU · ☆16 · Mar 23, 2023 · Updated 3 years ago
- Implementations of Multiple View Geometry in Computer Vision and some extended algorithms. · ☆11 · Sep 25, 2021 · Updated 4 years ago
- This repo is designed for my CUDA course projects · ☆48 · Mar 20, 2025 · Updated last year
- Triton Compiler related materials. · ☆44 · Mar 16, 2026 · Updated 2 months ago
- Dronet, adapted for Pytorch. · ☆11 · Oct 21, 2025 · Updated 7 months ago
- Acceleration codes for the Ozaki-scheme on integer matrix multiplication units. · ☆26 · Dec 10, 2025 · Updated 6 months ago
- Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. · ☆10 · Aug 19, 2023 · Updated 2 years ago
- Cleanlab Vizzy: illustrating the core ideas behind the Cleanlab algorithm · ☆16 · Apr 19, 2023 · Updated 3 years ago
- A collection of Topology Methods in Deep Learning · ☆18 · Jun 19, 2020 · Updated 5 years ago
- Performance test of NumPy shared memory module · ☆14 · Mar 8, 2016 · Updated 10 years ago
- CogNetX is an advanced, multimodal neural network architecture inspired by human cognition. It integrates speech, vision, and video proce… · ☆21 · Jun 8, 2026 · Updated last week
- Fast CUDA matrix multiplication from scratch · ☆1,216 · Sep 2, 2025 · Updated 9 months ago
- [ICML 2025] Adaptive Self-improvement LLM Agentic System for ML Library Development · ☆17 · Jan 6, 2026 · Updated 5 months ago
- A parser for PTX 6.5 · ☆13 · Jun 19, 2023 · Updated 2 years ago
- moved to https://github.com/Zhaoyilunnn/qdao · ☆10 · Aug 30, 2023 · Updated 2 years ago
- [VLDB'24] Blitzcrank is to compress in-memory, OLTP databases. It introduces a new entropy coding algorithm named Delayed Coding. · ☆40 · Sep 20, 2024 · Updated last year
- Load and run Llama from safetensors files in C · ☆15 · Oct 24, 2024 · Updated last year
- D3DSample · ☆11 · Apr 22, 2020 · Updated 6 years ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. · ☆487 · Mar 10, 2025 · Updated last year
- Dynamic Memory Management for Serving LLMs without PagedAttention · ☆491 · Updated this week