AlphaGPU / leetgpu-challenges
LeetGPU Challenges
☆34 Updated this week
Alternatives and similar repositories for leetgpu-challenges
Users interested in leetgpu-challenges are comparing it to the repositories listed below
- CUDA Matrix Multiplication Optimization ☆214 Updated last year
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆208 Updated 3 months ago
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆31 Updated 4 months ago
- ☆111 Updated 4 months ago
- NVIDIA tools guide ☆144 Updated 7 months ago
- Applied AI experiments and examples for PyTorch ☆289 Updated 2 months ago
- Fastest kernels written from scratch ☆311 Updated 4 months ago
- ☆228 Updated this week
- Fast low-bit matmul kernels in Triton ☆341 Updated this week
- ☆48 Updated 7 months ago
- An experimental CPU backend for Triton ☆139 Updated 2 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆347 Updated this week
- Cataloging released Triton kernels. ☆251 Updated 7 months ago
- Training material for Nsight developer tools ☆163 Updated last year
- ☆129 Updated 3 months ago
- Perplexity GPU Kernels ☆425 Updated last week
- ☆175 Updated last year
- Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar… ☆58 Updated last month
- CUTLASS and CuTe Examples ☆68 Updated 3 weeks ago
- Fast CUDA matrix multiplication from scratch ☆794 Updated last year
- A plugin for Jupyter Notebook to run CUDA C/C++ code ☆238 Updated 11 months ago
- ☆229 Updated last year
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆223 Updated this week
- Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O ☆398 Updated 2 months ago
- ☆171 Updated 2 years ago
- ☆106 Updated 7 months ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆71 Updated this week
- Step-by-step optimization of CUDA SGEMM ☆363 Updated 3 years ago
- kernels, of the mega variety ☆471 Updated 2 months ago
- 📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software ☆51 Updated 5 months ago