aws-samples / nki-llama

☆13

Alternatives and similar repositories for nki-llama:

Users who are interested in nki-llama are comparing it to the libraries listed below.

aws-neuron / nki-samples
☆34 Updated last month
awslabs / nki-autotune
☆14 Updated this week
awslabs / Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping
Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp…
☆11 Updated 7 months ago
aws-neuron / neuronx-distributed
☆55 Updated last month
google / iopddl
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆23 Updated 4 months ago
hao-ai-lab / MuxServe
☆59 Updated 10 months ago
mlcommons / chakra-old
Repository for MLCommons Chakra schema and tools
☆39 Updated last year
asplos-contest / 2025
The ASPLOS 2025 / EuroSys 2025 Contest Track
☆35 Updated last week
uclasystem / bamboo
Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.
☆49 Updated 2 years ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆99 Updated last year
triton-lang / kernels
☆79 Updated 6 months ago
Ying1123 / VTC-artifact
☆29 Updated 11 months ago
WukLab / preble
Stateful LLM Serving
☆65 Updated last month
suquark / hoplite
☆44 Updated 3 years ago
abhibambhaniya / GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
☆65 Updated last week
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆118 Updated last year
SymbioticLab / Oobleck
A resilient distributed training framework
☆95 Updated last year
argonne-lcf / LLM-Inference-Bench
LLM-Inference-Bench
☆40 Updated 4 months ago
cchan / tccl
Extensible collectives library in Triton
☆86 Updated last month
casys-kaist / EnvPipe
☆24 Updated last year
parasailteam / coconet
☆79 Updated 2 years ago
Azure / msccl
Microsoft Collective Communication Library
☆65 Updated 5 months ago
openxla / shardy
MLIR-based partitioning system
☆82 Updated this week
merthidayetoglu / HiCCL
A hierarchical collective communications library with portable optimizations
☆35 Updated 5 months ago
facebookresearch / param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…
☆138 Updated this week
facebookresearch / dlrm_datasets
Set of datasets for the deep learning recommendation model (DLRM).
☆45 Updated 2 years ago
zhisbug / Cavs
Cavs: An Efficient Runtime System for Dynamic Neural Networks
☆14 Updated 4 years ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆41 Updated last year
aws-neuron / upstreaming-to-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆15 Updated last week
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆46 Updated last year