making the official triton tutorials actually comprehensible
☆144Aug 25, 2025Updated 7 months ago
Alternatives and similar repositories for triton_docs_tutorials
Users that are interested in triton_docs_tutorials are comparing it to the libraries listed below.
- ☆31Mar 31, 2026Updated 2 weeks ago
- A tiny easily hackable implementation of a feature dashboard.☆16Oct 21, 2025Updated 5 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.☆470Mar 10, 2025Updated last year
- ☆249Jan 2, 2025Updated last year
- ☆29Nov 9, 2025Updated 5 months ago
- TensorRT encapsulation, learn, rewrite, practice.☆29Oct 19, 2022Updated 3 years ago
- Notes and code for Programming Massively Parallel Processors☆13Mar 29, 2025Updated last year
- Cataloging released Triton kernels.☆302Sep 9, 2025Updated 7 months ago
- ☆16May 14, 2025Updated 11 months ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- Sophgo AI chips driver and runtime library.☆24Updated this week
- Puzzles for learning Triton☆2,374Apr 1, 2026Updated 2 weeks ago
- [ICML 2024 Oral] LSH-Based Efficient Point Transformer (HEPT)☆24Jan 24, 2025Updated last year
- General Matrix Multiplication using NVIDIA Tensor Cores☆28Jan 25, 2025Updated last year
- [ACL'26 Findings] Official code for "BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search"☆23Updated this week
- A simplified implementation of flash-attention using cutlass, intended for teaching☆59Aug 12, 2024Updated last year
- ☆15Jul 18, 2022Updated 3 years ago
- 📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software☆64Feb 23, 2025Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆201Jun 1, 2025Updated 10 months ago
- Python package to export NN/RR interval series in KUBIOS HRV readable format and to import HRV results from KUBIOS report files in .txt f…☆12Jan 4, 2019Updated 7 years ago
- LLM LangChain quick start☆16Jun 14, 2023Updated 2 years ago
- Code for paper https://arxiv.org/abs/2501.00522☆14Apr 28, 2025Updated 11 months ago
- How to quickly serve an LLM using FastAPI, Celery, and Redis☆17Aug 29, 2023Updated 2 years ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆19Feb 9, 2026Updated 2 months ago
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 8 months ago
- Large-Vocabulary Continuous Sign Language Recognition, 2024☆16May 30, 2024Updated last year
- ☆17Jul 20, 2025Updated 8 months ago
- This repository contains exhaustive coverage of a hands-on approach to PyTorch alongside powerful tools to accelerate model tuning an…☆262Mar 11, 2026Updated last month
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!☆254Updated this week
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE☆17Aug 5, 2022Updated 3 years ago
- ☆16Nov 25, 2022Updated 3 years ago
- ArterialNet reconstructs arterial blood pressure (ABP) waveform☆13Feb 24, 2025Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Jul 30, 2024Updated last year
- ☆23Apr 7, 2026Updated last week
- ☆148Apr 4, 2026Updated 2 weeks ago
- [ICML 2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen☆17Sep 7, 2024Updated last year
- ☆15Apr 14, 2020Updated 6 years ago
- Implementation of a Transformer, but completely in Triton☆279Apr 5, 2022Updated 4 years ago
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year