mlops-discord / gpu-optimization-workshop
Slides, notes, and materials for the workshop
☆326 Updated last year
Alternatives and similar repositories for gpu-optimization-workshop
Users who are interested in gpu-optimization-workshop compare it to the repositories listed below.
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 Updated 3 weeks ago
- GPU Kernels ☆182 Updated last month
- 100 days of building GPU kernels! ☆445 Updated last month
- ☆74 Updated last year
- ☆159 Updated last year
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆189 Updated last month
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆364 Updated 3 months ago
- Some CUDA example code with READMEs. ☆165 Updated 3 months ago
- An implementation of the transformer architecture as an NVIDIA CUDA kernel ☆185 Updated last year
- Small-scale distributed training of sequential deep learning models, built on NumPy and MPI. ☆134 Updated last year
- An ML Systems Onboarding list ☆816 Updated 5 months ago
- ☆174 Updated 5 months ago
- Building blocks for foundation models. ☆511 Updated last year
- Where GPUs get cooked 👩‍🍳🔥 ☆234 Updated 3 months ago
- ☆343 Updated 2 months ago
- UNet diffusion model in pure CUDA ☆608 Updated 11 months ago
- Cataloging released Triton kernels. ☆238 Updated 5 months ago
- PyTorch Single Controller ☆218 Updated this week
- Fine-tune an LLM to perform batch inference and online serving. ☆112 Updated 3 weeks ago
- Alex Krizhevsky's original code from Google Code ☆192 Updated 9 years ago
- Best practices & guides on how to write distributed pytorch training code ☆441 Updated 4 months ago
- PyTorch per step fault tolerance (actively under development) ☆329 Updated this week
- Notes from the Latent Space paper club. Follow along or start your own! ☆234 Updated 10 months ago
- GPU programming related news and material links ☆1,590 Updated 5 months ago
- ☆504 Updated 11 months ago
- ☆219 Updated this week
- CUDA tutorials for maths and ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning. ☆182 Updated 2 weeks ago
- Tutorial Materials for "The Fundamentals of Modern Deep Learning with PyTorch" workshop at PyCon 2024 ☆244 Updated last year
- Contains hands-on example code for [O'Reilly book "Deep Learning At Scale"](https://www.oreilly.com/library/view/deep-learning-at/9781098… ☆26 Updated last year
- Fast low-bit matmul kernels in Triton ☆322 Updated last week