☆97Jun 30, 2026Updated this week
Alternatives and similar repositories for torch-xpu-ops
Users that are interested in torch-xpu-ops are comparing it to the libraries listed below.
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs☆76Jun 26, 2026Updated last week
- KFunca: A minimalist, high-performance GPU-based automatic differentiation framework☆31Aug 14, 2025Updated 10 months ago
- OpenAI Triton backend for Intel® GPUs☆257Jun 26, 2026Updated last week
- ☆61Mar 6, 2026Updated 3 months ago
- A Python package for extending the official PyTorch that can easily obtain performance on Intel platform☆2,014Mar 30, 2026Updated 3 months ago
- oneAPI - Data Parallel C++ course for students☆44Nov 4, 2024Updated last year
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…☆271Jun 25, 2026Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆17Mar 11, 2026Updated 3 months ago
- The repository contains a reference end-to-end pipeline for a real-time video analytics application. Realtime data is provided to an infe…☆12Nov 3, 2025Updated 8 months ago
- Cosmic Tagging Network for Neutrino Physics☆13Jun 26, 2024Updated 2 years ago
- Expert Specialization MoE Solution based on CUTLASS☆27Apr 14, 2026Updated 2 months ago
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …☆111Jun 26, 2026Updated last week
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆146Jun 26, 2026Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆88Updated this week
- Collection of small examples for running on ALCF resources☆24Jun 26, 2026Updated last week
- Intel® Optimization for Chainer*, a Chainer module providing numpy like API and DNN acceleration using MKL-DNN.☆178Jun 8, 2026Updated 3 weeks ago
- Cute layout visualization☆41Jan 18, 2026Updated 5 months ago
- CPU and GPU tutorial examples☆13Apr 4, 2025Updated last year
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools☆40Jul 31, 2025Updated 11 months ago
- A tracing infrastructure for heterogeneous computing applications.☆41Jun 24, 2026Updated last week
- ☆707Updated this week
- ☆117May 10, 2026Updated last month
- ☆289Updated this week
- Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver☆1,412Updated this week
- ☆13Aug 28, 2025Updated 10 months ago
- ☆13Apr 24, 2025Updated last year
- ☆23Mar 16, 2026Updated 3 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆48Aug 18, 2025Updated 10 months ago
- Yaksa: High-performance Noncontiguous Data Management☆17Oct 1, 2025Updated 9 months ago
- A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support…☆1,503Updated this week
- Wave: Python Domain-Specific Language for High Performance Machine Learning☆58Jun 8, 2026Updated 3 weeks ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆152Updated this week
- ☆17Jun 25, 2026Updated last week
- Mini-Engine Demonstration of Combining XeSS with VRS Tier 2.☆14Jan 26, 2026Updated 5 months ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆29Jun 24, 2026Updated last week
- ☆20May 30, 2026Updated last month
- Intel® AI Reference Models: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Inte…☆730Feb 11, 2026Updated 4 months ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆112Jun 28, 2025Updated last year
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch☆18Dec 21, 2022Updated 3 years ago