This repository documents my 100-day journey of learning and writing CUDA kernels.
☆31 · Mar 29, 2026 · Updated 2 months ago
Alternatives and similar repositories for 100-days-cuda
Users that are interested in 100-days-cuda are comparing it to the libraries listed below.
- ☆11 · Aug 4, 2025 · Updated 10 months ago
- Developed a high-performance trading engine using Rust, leveraging its powerful features for low-level systems programming. Engineered to… ☆23 · Nov 9, 2024 · Updated last year
- learning & making kernels in cuda / triton ☆22 · Aug 24, 2025 · Updated 9 months ago
- ☆32 · Jun 22, 2025 · Updated 11 months ago
- ☆24 · May 26, 2026 · Updated 2 weeks ago
- Using FlexAttention to compute attention with different masking patterns ☆47 · Sep 22, 2024 · Updated last year
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆37 · Oct 29, 2025 · Updated 7 months ago
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆453 · Feb 22, 2025 · Updated last year
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration ☆62 · Jan 26, 2026 · Updated 4 months ago
- A Beginner's Guide to Monetizing Your Python AI Chatbot ☆16 · Apr 22, 2025 · Updated last year
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- RyuseiLight is a beautiful, lightweight and extensible syntax highlighter. ☆15 · Aug 9, 2021 · Updated 4 years ago
- coding CUDA everyday! ☆77 · Feb 5, 2026 · Updated 4 months ago
- The Vulkan GPU radix sort implementation from Google Fuchsia, but with CMake ☆15 · Jan 13, 2023 · Updated 3 years ago
- Composition of Multimodal Language Models From Scratch ☆15 · Aug 16, 2024 · Updated last year
- ☆20 · May 30, 2026 · Updated last week
- Implementation of 12 AI agents evaluation techniques ☆43 · Jul 31, 2025 · Updated 10 months ago
- Header-only skip list library for modern C++ (C++17/C++20) ☆18 · Feb 1, 2022 · Updated 4 years ago
- A STL-like graph library ☆13 · Sep 5, 2025 · Updated 9 months ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆21 · Jan 24, 2025 · Updated last year
- EB1A DIY Collection ☆16 · Nov 17, 2025 · Updated 6 months ago
- ☆150 · Apr 4, 2026 · Updated 2 months ago
- Comprehensive option pricing and strategy analyzer ☆13 · Mar 29, 2025 · Updated last year
- Generate PDF/PNG slides from source code ☆12 · Oct 29, 2024 · Updated last year
- Tutorial for (PyTorch) + (C++) + (Metal shader) ☆16 · Oct 25, 2025 · Updated 7 months ago
- An emerald-depositing bot made specifically for the 廢土 (Wasteland) server ☆16 · Mar 20, 2024 · Updated 2 years ago
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat… ☆36 · Nov 20, 2025 · Updated 6 months ago
- A curation of awesome portfolio website ideas for developers and designers to draw inspiration from. Raise a pull request to add more. 💜… ☆17 · Apr 15, 2025 · Updated last year
- Apply GPU in ML and DL ☆68 · Mar 23, 2026 · Updated 2 months ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning ☆22 · Mar 6, 2026 · Updated 3 months ago
- implement GPT-OSS 20B & 120B C++ inference from scratch on AMD GPUs ☆176 · Oct 25, 2025 · Updated 7 months ago
- Personal solutions to the Triton Puzzles ☆21 · Jul 18, 2024 · Updated last year
- A command-line tool for converting SVG images to PDF files ☆17 · Mar 29, 2025 · Updated last year
- The UNofficial Rust SDK for Model Context Protocol servers and clients ☆18 · Nov 28, 2024 · Updated last year
- General Matrix Multiplication using NVIDIA Tensor Cores ☆28 · Jan 25, 2025 · Updated last year
- Learn RL Techniques in 3 Easy Projects ☆20 · Oct 16, 2024 · Updated last year
- Houdini Python Wiki ☆18 · Mar 18, 2024 · Updated 2 years ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆40 · Dec 2, 2025 · Updated 6 months ago
- Ship correct and fast LLM kernels to PyTorch ☆150 · Jan 14, 2026 · Updated 4 months ago