aws-neuron / nki-samples
☆51Updated 3 weeks ago
Alternatives and similar repositories for nki-samples
Users who are interested in nki-samples are comparing it to the libraries listed below.
- ☆60Updated last month
- Project showing how to develop NKI kernels for Llama 3.2 1B inference☆19Updated 4 months ago
- A schedule language for large model training☆151Updated 2 months ago
- ☆39Updated 10 months ago
- extensible collectives library in triton☆90Updated 6 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.☆215Updated last week
- How to ensure correctness and ship LLM generated kernels in PyTorch☆107Updated this week
- This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".☆69Updated last month
- ☆15Updated this week
- MLIR-based partitioning system☆139Updated this week
- TPU inference for vLLM, with unified JAX and PyTorch support.☆123Updated this week
- ☆112Updated last year
- ☆92Updated 11 months ago
- GitHub mirror of the triton-lang/triton repo.☆86Updated last week
- ☆242Updated this week
- Example code for AWS Neuron SDK developers building inference and training applications☆149Updated last week
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators☆87Updated 4 months ago
- Collection of kernels written in Triton language☆157Updated 6 months ago
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.☆111Updated last week
- PyTorch bindings for CUTLASS grouped GEMM.☆125Updated 4 months ago
- ☆141Updated 9 months ago
- ☆110Updated 9 months ago
- Fast low-bit matmul kernels in Triton☆385Updated this week
- ☆28Updated 9 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.☆264Updated this week
- ☆23Updated 2 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.☆491Updated this week
- Applied AI experiments and examples for PyTorch☆299Updated 2 months ago
- Cataloging released Triton kernels.☆263Updated last month
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.☆41Updated 2 years ago