dlsyscourse / public_notebooks
☆58Updated 6 months ago
Alternatives and similar repositories for public_notebooks
Users who are interested in public_notebooks are comparing it to the repositories listed below
- ☆34Updated last year
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T…☆67Updated 4 years ago
- Cataloging released Triton kernels.☆226Updated 4 months ago
- ☆169Updated last year
- ☆87Updated 8 months ago
- ☆157Updated last year
- ring-attention experiments☆143Updated 7 months ago
- ☆72Updated last year
- A minimal implementation of vLLM.☆41Updated 10 months ago
- ☆168Updated 5 months ago
- Learning material for CMU 10-714: Deep Learning Systems☆251Updated last year
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch☆115Updated last month
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.☆127Updated this week
- ☆37Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆44Updated 10 months ago
- Collection of kernels written in Triton language☆125Updated 2 months ago
- ☆14Updated 2 weeks ago
- ☆67Updated 7 months ago
- Personal solutions to the Triton Puzzles☆18Updated 10 months ago
- ☆93Updated last week
- CUDA and Triton implementations of Flash Attention with SoftmaxN.☆70Updated last year
- Memory Optimizations for Deep Learning (ICML 2023)☆64Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)☆104Updated 2 years ago
- Course materials for MIT 6.5940: TinyML and Efficient Deep Learning Computing☆46Updated 4 months ago
- ☆215Updated this week
- Tutorials for Triton, a language for writing GPU kernels☆18Updated last year
- Experiment of using Tangent to autodiff triton☆79Updated last year
- Custom kernels in Triton language for accelerating LLMs☆20Updated last year
- making the official triton tutorials actually comprehensible☆34Updated 2 months ago
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆105Updated last year