sgl-project/sgl-kernel-npu

SGLang kernel library for NPU

☆170

Alternatives and similar repositories for sgl-kernel-npu

Users who are interested in sgl-kernel-npu are comparing it to the libraries listed below.

sgl-project / sgl-kernel-xpu
SGLang kernel library for Intel XPU
☆27 · Updated this week
cosdt / vllm-ascend
See vLLM official support: https://github.com/vllm-project/vllm-ascend
☆11 · Feb 5, 2025 · Updated last year
Ascend / triton-ascend
Triton adapter for Ascend. Mirror of https://gitcode.com/ascend/triton-ascend
☆127 · May 18, 2026 · Updated 2 months ago
vllm-project / vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
☆2,512 · Updated this week
tile-ai / tilelang-ascend
Ascend TileLang adapter
☆340 · Updated this week
Ascend / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆17 · Updated this week
serdes21 / flashtile
FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
☆61 · Feb 6, 2026 · Updated 5 months ago
YJMSTR / flash-linear-attention
FLA but cuTile
☆27 · Apr 17, 2026 · Updated 3 months ago
cherichy / tilecute
☆32 · Jul 2, 2025 · Updated last year
omni-ai-npu / omni-infer
Omni_Infer is a suite of inference accelerators designed for the Ascend NPU platform, offering native support and an expanding feature se…
☆127 · Updated this week
tile-ai / AttentionEngine
☆52 · May 19, 2025 · Updated last year
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆861 · Jan 5, 2026 · Updated 6 months ago
tile-ai / TileOPs
High-performance LLM operator library built on TileLang.
☆164 · Updated this week
DeepLink-org / DLSlime
Composable and Embeddable Communication Runtime for Distributed AI Services
☆102 · Jun 5, 2026 · Updated last month
LMCache / LMCache-Ascend
LMCache on Ascend
☆82 · Updated this week
hw-native-sys / pto-isa
PTO instruction set architecture
☆64 · Updated this week
vllm-project / tml-fa4
FA4-based Relative Attention Kernel developed by TML and Colfax
☆17 · Jul 17, 2026 · Updated last week
shinezyy / deepseek_model
☆42 · Oct 12, 2025 · Updated 9 months ago
sgl-project / sglang-omni
SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
☆706 · Updated this week
Ascend / triton-ascend-ops
☆22 · Jun 29, 2026 · Updated last month
sgl-project / SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆1,018 · Updated this week
xLLM-AI / xllm
A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators. It is hosted in OpenAtom Fou…
☆1,496 · Updated this week
triton-lang / triton-ascend
Triton language and compiler for Ascend NPU
☆118 · Updated this week
tile-ai / TileRT
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆1,596 · Jul 14, 2026 · Updated 2 weeks ago
flashinfer-ai / cutlass-viz
☆65 · Apr 26, 2025 · Updated last year
toyaix / tritonllm
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
☆119 · Apr 28, 2026 · Updated 3 months ago
ROCm / FlyDSL
FlyDSL is the Python front‑end of the project: a Flexible Layout Python DSL for expressing tiling, partitioning, data movement, and kerne…
☆252 · Updated this week
wzzll123 / MultiKernelBench
MultiArchKernelBench: A Multi-Platform Benchmark for Kernel Generation
☆66 · Jul 8, 2026 · Updated 3 weeks ago
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,503 · Jul 20, 2026 · Updated last week
abdelfattah-lab / nitro
Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs
☆29 · Dec 17, 2024 · Updated last year
HazyResearch / Megakernels
Kernels, of the mega variety :)
☆787 · May 26, 2026 · Updated 2 months ago
foundation-model-stack / vllm-triton-backend
A Triton-only attention backend for vLLM
☆27 · Jul 14, 2026 · Updated 2 weeks ago
sgl-project / mini-sglang
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,643 · May 17, 2026 · Updated 2 months ago
sgl-project / sglang-jax
JAX backend for SGL
☆319 · Updated this week
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆6,063 · Updated this week
flagos-ai / libtriton_jit
A Triton JIT runtime and ffi provider in C++
☆37 · Updated this week
NVIDIA / SOL-ExecBench
A benchmark of real-world DL kernel problems
☆265 · Jul 15, 2026 · Updated 2 weeks ago
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆7,007 · Updated this week
BBuf / AI-Infra-Auto-Driven-SKILLS
☆699 · Updated this week