OpenAI Triton backend for Intel® GPUs
☆232 · Updated this week
Alternatives and similar repositories for intel-xpu-backend-for-triton
Users interested in intel-xpu-backend-for-triton are comparing it to the libraries listed below.
- Shared Middle-Layer for Triton Compilation (☆331, updated Dec 5, 2025)
- SYCL* Templates for Linear Algebra (SYCL*TLA), a SYCL-based CUTLASS implementation for Intel GPUs (☆68, updated this week)
- An experimental CPU backend for Triton (https://github.com/openai/triton) (☆49, updated Aug 18, 2025)
- An experimental CPU backend for Triton (☆181, updated Feb 25, 2026)
- Intel® Extension for MLIR, a staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain (☆148, updated this week)
- Development repository for the Triton-Linalg conversion (☆215, updated Feb 7, 2025)
- TPP experimentation on MLIR for linear algebra (☆146, updated Feb 24, 2026)
- Intel® Extension for DeepSpeed*, an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note… (☆65, updated Jun 30, 2025)
- My study notes for mlsys (☆14, updated Nov 4, 2024)
- A Python package that extends the official PyTorch to easily obtain performance on Intel platforms (☆2,013, updated Feb 13, 2026)
- Profiling Tools Interfaces for GPU (PTI for GPU), a set of getting-started documentation and a tools library to start performance analysi… (☆263, updated Feb 23, 2026)
- oneAPI Collective Communications Library (oneCCL) (☆256, updated Feb 4, 2026)
- FlagGems, an operator library for large language models implemented in the Triton language (☆909, updated Mar 3, 2026)
- oneAPI Level Zero Specification Headers and Loader (☆311, updated Feb 24, 2026)
- The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem (☆1,760, updated this week)
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure (☆981, updated this week)
- Tritonbench, a collection of PyTorch custom operators with example inputs to measure their performance (☆329, updated this week)
- Generate Linux perf event tables for Apple Silicon (☆17, updated Dec 16, 2025)
- Intel staging area for llvm.org contributions; home for Intel LLVM-based projects (☆1,438, updated this week)
- jax-triton contains integrations between JAX and OpenAI Triton (☆439, updated Feb 27, 2026)
- Intel® NPU Acceleration Library (☆709, updated Apr 24, 2025)
- Automatic differentiation for Triton kernels (☆29, updated Aug 12, 2025)
- C/C++ frontend for MLIR; also features polyhedral optimizations, parallel optimizations, and more (☆605, updated Jun 19, 2025)
- A suite of tools for pretty-printing, diffing, and exploring abstract syntax trees (☆15, updated Mar 3, 2026)
- IREE's PyTorch frontend, based on Torch Dynamo (☆105, updated Mar 3, 2026)
- Framework to reduce autotune overhead to zero for well-known deployments (☆97, updated Sep 19, 2025)
- Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver (☆1,349, updated this week)
- TileFusion, an experimental C++ macro kernel template library that raises the abstraction level of CUDA C for tile processing (☆107, updated Jun 28, 2025)
- Reference models for the Intel® Gaudi® AI Accelerator (☆170, updated Jan 8, 2026)
- Collection of kernels written in the Triton language (☆181, updated Jan 27, 2026)
- TORCH_TRACE parser for PT2 (☆78, updated Feb 26, 2026)
- An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures) (☆696, updated this week)