aws-samples / nki-llama
☆12 Updated 2 weeks ago
Alternatives and similar repositories for nki-llama
Users who are interested in nki-llama are comparing it to the repositories listed below:
- ☆29 Updated this week
- ☆11 Updated last week
- ☆52 Updated last month
- A schedule language for large model training ☆145 Updated 9 months ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆23 Updated 3 months ago
- MLIR-based partitioning system ☆74 Updated this week
- ☆73 Updated 4 months ago
- ☆23 Updated 4 months ago
- ☆44 Updated last year
- ☆76 Updated 2 years ago
- extensible collectives library in triton ☆84 Updated 6 months ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks ☆14 Updated 4 years ago
- The ASPLOS 2025 / EuroSys 2025 Contest Track ☆30 Updated last week
- ☆47 Updated 2 years ago
- The documents for TVM Unity ☆9 Updated 7 months ago
- GitHub mirror of triton-lang/triton repo. ☆25 Updated this week
- Data-Centric MLIR dialect ☆40 Updated last year
- ☆35 Updated 3 months ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems ☆237 Updated last week
- Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation ☆58 Updated last year
- ☆57 Updated 3 months ago
- Microsoft Collective Communication Library ☆60 Updated 4 months ago
- ☆18 Updated 11 months ago
- Fast low-bit matmul kernels in Triton ☆267 Updated this week
- ☆162 Updated 9 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models. ☆61 Updated last week
- ☆90 Updated 2 weeks ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆189 Updated last week
- Sparse kernels for GNNs based on TVM ☆16 Updated 4 years ago
- ☆37 Updated this week