vllm-project/vllm-xpu-kernels

The vLLM XPU kernels for Intel GPU

☆55

Alternatives and similar repositories for vllm-xpu-kernels

Users who are interested in vllm-xpu-kernels are comparing it to the libraries listed below.

intel / torch-xpu-ops
☆97 · Updated this week
vllm-project / vllm-gaudi
Community maintained hardware plugin for vLLM on Intel Gaudi
☆49 · Updated this week
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆65 · May 27, 2026 · Updated last month
HabanaAI / vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
☆89 · Jul 13, 2026 · Updated last week
sgl-project / sgl-kernel-xpu
SGLang kernel library for Intel XPU
☆27 · Updated this week
vllm-project / vllm-skills
Agent skills for vLLM
☆89 · Apr 3, 2026 · Updated 3 months ago
HabanaAI / gaudi-pytorch-bridge
☆18 · Jul 13, 2026 · Updated last week
intel / sycl-tla
SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs
☆76 · Updated this week
vllm-project / vllm-daily
vLLM Daily Summarization of Merged PRs
☆51 · Updated this week
SearchSavior / OpenArc
Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints.
☆486 · Updated this week
datafusion-contrib / sqlbench-h
SQL Benchmark derived from TPC-H
☆11 · May 20, 2023 · Updated 3 years ago
openvinotoolkit / npu_compiler
OpenVINO Intel NPU Compiler
☆92 · Updated this week
gau-nernst / quantized-training
Explore training for quantized models
☆26 · Jul 12, 2025 · Updated last year
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆258 · Updated this week
xlite-dev / qwen-image-fast
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
☆17 · Oct 24, 2025 · Updated 8 months ago
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20 · Aug 3, 2025 · Updated 11 months ago
xlite-dev / longcat-video-fast
🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bits weight only.
☆15 · Oct 28, 2025 · Updated 8 months ago
c4pt0r / awesome-tidb
✨A list of awesome TiDB patterns, code templates, demos✨
☆20 · Jun 27, 2022 · Updated 4 years ago
GameTechDev / XeSS-VALAR-Demo
Mini-Engine Demonstration of Combining XeSS with VRS Tier 2.
☆14 · Jan 26, 2026 · Updated 5 months ago
substrait-io / substrait-cpp
☆17 · Apr 10, 2026 · Updated 3 months ago
wafer-ai / kernel-arena
Public benchmark results from Kernel Arena, a leaderboard for LLM-generated AI accelerator kernels.
☆20 · Mar 11, 2026 · Updated 4 months ago
vllm-project / vllm-project.github.io
☆55 · Updated this week
serdes21 / flashtile
FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
☆61 · Feb 6, 2026 · Updated 5 months ago
intel / intel-xai-tools
Explainable AI Tooling (XAI). XAI is used to discover and explain a model's prediction in a way that is interpretable to the user. Releva…
☆39 · Sep 22, 2025 · Updated 9 months ago
NonvolatileMemory / flash_tree_attn
☆20 · Dec 24, 2024 · Updated last year
kubernetes / dynamic-resource-allocation
☆51 · Updated this week
abdelfattah-lab / nitro
Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs
☆29 · Dec 17, 2024 · Updated last year
xlite-dev / flux-faster
A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.
☆24 · Jul 18, 2025 · Updated last year
pzhao-eng / FlashMLA
☆66 · Feb 15, 2026 · Updated 5 months ago
intel / level-zero-npu-extensions
☆17 · Updated this week
cloudflare / docs-examples
Examples surfaced in the Cloudflare Docs
☆17 · Jun 24, 2026 · Updated 3 weeks ago
Infrasys-AI / aiinfra-docs
☆21 · Nov 6, 2025 · Updated 8 months ago
kiloGrand / kuiperinfer
A self-made C++ deep learning forward inference framework
☆22 · Jun 4, 2023 · Updated 3 years ago
triple-mu / Qwen-Image-TensorRT
Qwen-Image's DiT inference with TensorRT-10
☆21 · Oct 13, 2025 · Updated 9 months ago
quic / software-kit-for-qualcomm-cloud-ai-100
Software kit for Qualcomm Cloud AI 100
☆19 · Dec 15, 2025 · Updated 7 months ago
HabanaAI / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆14 · Jan 8, 2026 · Updated 6 months ago
Aimol-l / Genllm
A modular LLM inference framework based on C++23 with native parsing of the GGUF format.
☆39 · Jun 19, 2026 · Updated last month
GameTechDev / VALAR
Velocity And Luminance Adaptive Rasterization
☆16 · Mar 31, 2023 · Updated 3 years ago
onecodex / nim-bitarray
Bitarray implementation in Nim
☆10 · Dec 14, 2020 · Updated 5 years ago