yzlnew / infra-skills
A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-performance systems.
☆57Feb 2, 2026Updated 2 weeks ago
Alternatives and similar repositories for infra-skills
Users who are interested in infra-skills are comparing it to the repositories listed below.
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 6 months ago
- [NeurIPS 2025] ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive☆66Dec 11, 2025Updated 2 months ago
- Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration☆33Jan 8, 2026Updated last month
- ☆35Mar 7, 2025Updated 11 months ago
- Pipeline Parallelism Emulation and Visualization☆79Jan 8, 2026Updated last month
- ☆19Aug 20, 2025Updated 5 months ago
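- Shared repository for Sast Tutor, run by the training division of the Student Association for Science and Technology, Department of Electronic Engineering, Tsinghua University☆14Apr 27, 2022Updated 3 years ago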
- DeeperGEMM: crazy optimized version☆74May 5, 2025Updated 9 months ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- ☆41Oct 15, 2025Updated 4 months ago
- ☆38Aug 7, 2025Updated 6 months ago
- ☆45Feb 5, 2026Updated last week
- Learning TileLang with 10 puzzles!☆132Jan 30, 2026Updated 2 weeks ago
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.☆48Feb 6, 2026Updated last week
- A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention☆281Dec 1, 2025Updated 2 months ago
- ☆65Apr 26, 2025Updated 9 months ago
- Tile-based language built for AI computation across all scales☆123Updated this week
- ☆32Jul 29, 2025Updated 6 months ago
- From Minimal GEMM to Everything☆139Feb 10, 2026Updated last week
- Simple Linux Filesystem designed for learning purposes☆32May 22, 2018Updated 7 years ago
- ☆165Feb 5, 2026Updated last week
- ☆177May 7, 2025Updated 9 months ago
- ☆40Dec 31, 2021Updated 4 years ago
- 🌈 Solutions of LeetGPU☆72Feb 4, 2026Updated last week
- HierCGRA: An Open-Source Framework for Large-Scale CGRA with Hierarchical Modeling and Automated Exploration☆14Mar 6, 2023Updated 2 years ago
- ☆10Mar 19, 2025Updated 10 months ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 4 months ago
- Unofficial description of the CUDA assembly (SASS) instruction sets.☆201Jul 18, 2025Updated 6 months ago
- ☆114May 16, 2025Updated 9 months ago
- ☆57May 21, 2025Updated 8 months ago
- ☆118May 19, 2025Updated 8 months ago
- EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation☆27Jul 30, 2025Updated 6 months ago
- LaTeX template for ITMO style presentations☆10Jan 19, 2025Updated last year
- My solution code to parallel architecture and programming Spring 2016☆12Aug 15, 2016Updated 9 years ago
- A TensorFlow Keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio…☆10Dec 18, 2019Updated 6 years ago
- ☆10Jun 4, 2024Updated last year
- a simple API to use CUPTI☆11Aug 19, 2025Updated 5 months ago
- ☆18May 24, 2025Updated 8 months ago