1y33 / 100Days
GPU Kernels
☆178 Updated last month
Alternatives and similar repositories for 100Days
Users who are interested in 100Days are comparing it to the libraries listed below.
- 100 days of building GPU kernels! ☆430 Updated last month
- ☆328 Updated last month
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆184 Updated this week
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆357 Updated 2 months ago
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆348 Updated 3 months ago
- ☆168 Updated 5 months ago
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆29 Updated last month
- Learnings and programs related to CUDA ☆402 Updated 3 months ago
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch ☆196 Updated last month
- repo of paper implementations ☆19 Updated 3 months ago
- Question papers from courses taught at IISC as part of the MTech AI curriculum ☆65 Updated 6 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆181 Updated 3 weeks ago
- making the official triton tutorials actually comprehensible ☆34 Updated 2 months ago
- ☆35 Updated last week
- This repo has all the basic things you'll need in order to understand complete vision transformer architecture and its various implementa… ☆218 Updated 5 months ago
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆268 Updated 6 months ago
- ☆255 Updated 4 months ago
- Assignments from courses taught at IISC as part of the MTech AI curriculum ☆116 Updated 3 months ago
- Slides, notes, and materials for the workshop ☆326 Updated last year
- ☆39 Updated 3 weeks ago
- ☆89 Updated last month
- Distributed training (multi-node) of a Transformer model ☆67 Updated last year
- Apply GPU in ML and DL ☆52 Updated 3 months ago
- Basically a repo containing architectures/algorithms/papers from scratch in pytorch ☆20 Updated last month
- ☆46 Updated 2 months ago
- Here's all my Python/Numba (CUDA) code for the encoder block I made :) ☆63 Updated last month
- Accelerated General (FP32) Matrix Multiplication from scratch in CUDA ☆116 Updated 4 months ago
- Challenging myself to learn CUDA (Basics → Intermediate) these 100 days. ☆23 Updated 3 weeks ago
- coding CUDA everyday! ☆31 Updated last month
- RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems ☆105 Updated 4 months ago