☆61Dec 18, 2024Updated last year
Alternatives and similar repositories for xetla
Users that are interested in xetla are comparing it to the libraries listed below
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs☆67Feb 21, 2026Updated last week
- OpenAI Triton backend for Intel® GPUs☆228Feb 21, 2026Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note…☆65Jun 30, 2025Updated 8 months ago
- ☆57Nov 18, 2025Updated 3 months ago
- Intel® Tensor Processing Primitives extension for Pytorch*☆18Updated this week
- OpenVINO LLM Benchmark☆11Dec 7, 2023Updated 2 years ago
- ☆15Oct 20, 2020Updated 5 years ago
- ☆153Updated this week
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆148Updated this week
- SYCL implementation of Fused MLPs for Intel GPUs☆51Nov 24, 2025Updated 3 months ago
- ☆153Feb 7, 2026Updated 3 weeks ago
- ☆20Jan 29, 2026Updated last month
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…☆261Updated this week
- ☆17Feb 3, 2026Updated 3 weeks ago
- ☆20Mar 27, 2023Updated 2 years ago
- Training examples for SYCL☆49Nov 16, 2025Updated 3 months ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement.☆260Jan 13, 2025Updated last year
- portDNN is a library implementing neural network algorithms written using SYCL☆114May 21, 2024Updated last year
- ☆692Updated this week
- Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver☆1,346Updated this week
- oneAPI Level Zero Conformance & Performance test content☆60Updated this week
- oneAPI Level Zero Specification Headers and Loader☆311Updated this week
- Fast GPU based tensor core reductions☆13Jan 13, 2023Updated 3 years ago
- Yaksa: High-performance Noncontiguous Data Management☆15Oct 1, 2025Updated 5 months ago
- DeskVOX is a real-time visualization tool for 3D data sets like image stacks from CT or MRI scanners, or confocal microscopes. It has an …☆21Jan 22, 2026Updated last month
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆14Jan 8, 2026Updated last month
- parser script to process pytorch autograd profiler result, convert json file to excel.☆14Oct 8, 2019Updated 6 years ago
- oneCCL Bindings for Pytorch* (deprecated)☆105Dec 31, 2025Updated 2 months ago
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A…☆285Mar 26, 2025Updated 11 months ago
- Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class☆17Feb 18, 2025Updated last year
- TPP experimentation on MLIR for linear algebra☆146Updated this week
- Python SYCL bindings and SYCL-based Python Array API library☆121Updated this week
- ☆19Dec 4, 2025Updated 2 months ago
- ☆24Oct 9, 2025Updated 4 months ago
- ALCF Systems User Documentation☆29Feb 21, 2026Updated last week
- oneAPI Collective Communications Library (oneCCL)☆256Feb 4, 2026Updated 3 weeks ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU☆84Oct 8, 2019Updated 6 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆49Aug 18, 2025Updated 6 months ago
- ☆59Feb 5, 2026Updated 3 weeks ago