A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimization techniques.
☆35 Nov 20, 2025 Updated 3 months ago
Alternatives and similar repositories for gpu-programming-101
Users that are interested in gpu-programming-101 are comparing it to the libraries listed below
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration ☆40 Jan 26, 2026 Updated last month
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning ☆26 Jun 18, 2025 Updated 8 months ago
- An Ecomm app built on PHP & MySQL ☆10 Oct 27, 2022 Updated 3 years ago
- Code implementation for paper AbsenceBench: Language Models Can't Tell What's Missing ☆17 Oct 23, 2025 Updated 4 months ago
- Exploring how optimizations for GEMMs work ☆28 Jan 1, 2026 Updated last month
- An interactive Rust learning platform featuring progressive exercises aligned with "The Rust Programming Language" book. ☆20 Dec 8, 2025 Updated 2 months ago
- ☆11 Mar 14, 2023 Updated 2 years ago
- A curation of awesome portfolio website ideas for developers and designers to draw inspiration from. Raise a pull request to add more. 💜… ☆15 Apr 15, 2025 Updated 10 months ago
- Minimal implementation of a Byte Pair Encoding (BPE) tokenizer in Zig ☆14 Apr 7, 2025 Updated 10 months ago
- Integration of Pydantic with Kedro. ☆12 Aug 5, 2024 Updated last year
- API for Asset Service ☆15 Aug 15, 2024 Updated last year
- ☆11 Aug 10, 2021 Updated 4 years ago
- ☆17 Jun 18, 2025 Updated 8 months ago
- From a+b to sparsemax(QK^T)V in Triton! ☆28 Jun 19, 2025 Updated 8 months ago
- Orpheus TTS Server with streaming support (TTFB ~160ms) ☆23 Sep 21, 2025 Updated 5 months ago
- OSEP - Offsec Expert Professional ☆21 Jun 23, 2024 Updated last year
- A straightforward method to reduce your LLM inference API costs and token usage. ☆21 May 18, 2025 Updated 9 months ago
- ☆19 Mar 3, 2025 Updated 11 months ago
- Demo project to experiment with live video streaming, nodejs and kubernetes ☆15 Jun 13, 2018 Updated 7 years ago
- Learn RL Techniques in 3 Easy Projects ☆17 Oct 16, 2024 Updated last year
- Learning Robot Geometry as Distance Fields: Applications to Whole-body Manipulation ☆20 Sep 4, 2024 Updated last year
- Fetch & Filter Known URLs ☆15 Aug 3, 2022 Updated 3 years ago
- A Transformer Model Exploiting Histology Images and Spatial Gene Expression ☆22 Mar 18, 2025 Updated 11 months ago
- Implement GPT-OSS 20B & 120B C++ inference from scratch on AMD GPUs ☆169 Oct 25, 2025 Updated 4 months ago
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding" ☆57 Dec 26, 2025 Updated 2 months ago
- Code repository for the Go Cookbook ☆15 Jan 28, 2022 Updated 4 years ago
- WordPress Elementor 3.6.0 3.6.1 3.6.2 RCE POC ☆16 Apr 17, 2022 Updated 3 years ago
- A series of high-performance GEMM (General Matrix Multiply) implementations, iteratively optimised for H100 GPUs in pure CUDA. ☆70 Feb 18, 2026 Updated last week
- Umbrella will protect your shellcode from the rain. ☆31 Jun 4, 2025 Updated 8 months ago
- Output Burp body only and auto-prettify ☆20 May 1, 2025 Updated 9 months ago
- Various LLM Benchmarks ☆24 Feb 20, 2026 Updated last week
- ☆20 Dec 14, 2024 Updated last year
- Tactile simulation extensions for Isaac Sim ☆23 Feb 9, 2026 Updated 2 weeks ago
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆27 Dec 17, 2024 Updated last year
- Windows Privilege Escalation ☆23 Jun 7, 2022 Updated 3 years ago
- A Kedro Plugin for Databricks ☆23 Dec 11, 2025 Updated 2 months ago
- 3D Navier-Stokes Local Discontinuous Galerkin Solver ☆19 Sep 7, 2018 Updated 7 years ago
- Open source implementation of "CppFlow: Generative Inverse Kinematics for Efficient and Robust Cartesian Path Planning" (ICRA 2024) ☆24 Nov 14, 2025 Updated 3 months ago
- This repository documents my 100-day journey of learning and writing CUDA kernels. ☆22 Jun 25, 2025 Updated 8 months ago