This repository documents my 100-day journey of learning and writing CUDA kernels.
☆30 · Mar 29, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for 100-days-cuda
Users who are interested in 100-days-cuda are comparing it to the libraries listed below.
- A series of high-performance GEMM (General Matrix Multiply) implementations iteratively optimised for H100 GPUs in pure CUDA. ☆76 · Feb 18, 2026 · Updated 2 months ago
- TensorRT ☆11 · Sep 22, 2020 · Updated 5 years ago
- learning & making kernels in cuda / triton ☆22 · Aug 24, 2025 · Updated 7 months ago
- Curated list of Moroccans publishing in the most prestigious AI conferences ☆11 · Oct 14, 2024 · Updated last year
- This project is a versatile and powerful search tool that leverages state-of-the-art natural language processing models to provide releva… ☆12 · Apr 3, 2023 · Updated 3 years ago
- ☆32 · Jun 22, 2025 · Updated 9 months ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆19 · Feb 9, 2026 · Updated 2 months ago
- Persistent dense gemm for Hopper in `CuTeDSL` ☆15 · Aug 9, 2025 · Updated 8 months ago
- Advancing TTP Analysis: Harnessing the Power of Large Language Models with Retrieval Augmented Generation ☆11 · May 14, 2024 · Updated last year
- LLM query engine to retrieve augmented responses from json files. ☆15 · Oct 12, 2023 · Updated 2 years ago
- Scripts and outputs for ATLAS data in STIX JSON and ATT&CK Navigator layer formats ☆28 · Mar 31, 2026 · Updated 2 weeks ago
- ☆23 · Apr 7, 2026 · Updated last week
- Using FlexAttention to compute attention with different masking patterns ☆47 · Sep 22, 2024 · Updated last year
- Training framework for Large Behavioral Models ☆28 · Sep 17, 2025 · Updated 7 months ago
- A straightforward method to reduce your LLM inference API costs and token usage. ☆22 · May 18, 2025 · Updated 11 months ago
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration ☆60 · Jan 26, 2026 · Updated 2 months ago
- Fork of Rust concurrent hash map benchmarks to include leapfrog map. ☆14 · Mar 13, 2022 · Updated 4 years ago
- A Beginner's Guide to Monetizing Your Python AI Chatbot ☆16 · Apr 22, 2025 · Updated 11 months ago
- A collection of GPU experiments and benchmarks for my personal understanding and research. ☆28 · Apr 9, 2026 · Updated last week
- coding CUDA everyday! ☆74 · Feb 5, 2026 · Updated 2 months ago
- Composition of Multimodal Language Models From Scratch ☆15 · Aug 16, 2024 · Updated last year
- Implementation of 12 AI agents evaluation techniques ☆39 · Jul 31, 2025 · Updated 8 months ago
- The Vulkan GPU radix sort implementation from Google Fuchsia, but with CMake ☆13 · Jan 13, 2023 · Updated 3 years ago
- ☆18 · Updated this week
- Contains my solutions for various online judge problems, organized in the worst possible way ☆14 · Jul 25, 2015 · Updated 10 years ago
- ☆13 · Oct 9, 2024 · Updated last year
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆21 · Jan 24, 2025 · Updated last year
- Python script to automatically create Sigma rules from TheHive observables ☆25 · Mar 17, 2019 · Updated 7 years ago
- From a+b to sparsemax(QK^T)V in Triton! ☆29 · Jun 19, 2025 · Updated 10 months ago
- ☆148 · Apr 4, 2026 · Updated 2 weeks ago
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- Apply GPU in ML and DL ☆67 · Mar 23, 2026 · Updated 3 weeks ago
- A curation of awesome portfolio website ideas for developers and designers to draw inspiration from. Raise a pull request to add more. 💜… ☆17 · Apr 15, 2025 · Updated last year
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat… ☆35 · Nov 20, 2025 · Updated 4 months ago
- Certified robustness of deep neural networks ☆19 · Aug 20, 2024 · Updated last year
- VNHSGE: Vietnamese High School Graduation Examination Dataset for Large Language Models ☆28 · Jul 24, 2023 · Updated 2 years ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning ☆22 · Mar 6, 2026 · Updated last month
- A Transformer Model Exploiting Histology Images and Spatial Gene Expression ☆22 · Mar 18, 2025 · Updated last year
- MCP server for Hugging Face dataset viewer ☆30 · Apr 25, 2025 · Updated 11 months ago