SwekeR-463 / 100kernels

100 days of learning & making kernels in CUDA / Triton

☆22

Alternatives and similar repositories for 100kernels

Users who are interested in 100kernels are comparing it to the libraries listed below.

kabir2505 / tiny-mixtral
☆39 Updated last month
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆184 Updated last week
cloneofsimo / ptx-tutorial-by-aislop
PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7)
☆67 Updated 2 months ago
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆58 Updated 2 weeks ago
kmohan321 / Research_Papers
☆46 Updated 2 months ago
1y33 / 100Days
GPU Kernels
☆178 Updated last month
apoorvnandan / lilgrad
pytorch from scratch in pure C/CUDA and python
☆40 Updated 7 months ago
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆406 Updated 3 months ago
loganwatchorn / notes-pmpp
Notes on "Programming Massively Parallel Processors" by Hwu, Kirk, and Hajj (4th ed.)
☆53 Updated 9 months ago
unixpickle / learn-ptx
Learning about CUDA by writing PTX code.
☆131 Updated last year
joey00072 / Tinytorch
A really tiny autograd engine
☆94 Updated last week
silvaxxx1 / MyLLM101
"LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!"
☆30 Updated last month
hkproj / multi-latent-attention
☆35 Updated last week
JINO-ROHIT / advanced_ml
☆45 Updated 3 weeks ago
SwayamInSync / pytorch-cpp-cuda-starter
Setting up VS Code to work with PyTorch in C/C++ with CUDA support
☆25 Updated 4 months ago
AmeyaWagh / llama2.cpp
Inference Llama 2 in C++
☆43 Updated last year
evintunador / triton_docs_tutorials
making the official triton tutorials actually comprehensible
☆34 Updated 2 months ago
kanpuriyanawab / minbpe.c
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, in pure C.
☆21 Updated 10 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? we'll see.
☆24 Updated this week
smolorg / smolgrad
small autograd engine inspired by Karpathy's micrograd and PyTorch
☆268 Updated 6 months ago
aryagxr / cuda
coding CUDA everyday!
☆31 Updated last month
kanpuriyanawab / picograd
Rust Implementation of micrograd
☆51 Updated 11 months ago
hkproj / 100-days-of-gpu
☆328 Updated last month
0xD4rky / Vision-Transformers
This repo has all the basic things you'll need in order to understand the complete vision transformer architecture and its various implementa…
☆218 Updated 5 months ago
victor-explore / AI-Q-Papers-IISC-Banglore
Question papers of courses taught at IISc as part of the MTech AI curriculum
☆65 Updated 6 months ago
KhawajaAbaid / micrograd_c
Andrej Karpathy's micrograd implemented in C
☆28 Updated 9 months ago
krupadav3 / Encoder-Block-in-CUDA
Here's all my Python/Numba (CUDA) code for the encoder block I made :)
☆63 Updated last month
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆111 Updated last week
atullchaurasia / transformers
Transformers from scratch using PyTorch & NumPy.
☆24 Updated 3 months ago
MekkCyber / CutlassAcademy
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS
☆181 Updated 3 weeks ago