meta-pytorch / triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
☆49 Aug 18, 2025 Updated 5 months ago
Alternatives and similar repositories for triton-cpu
Users who are interested in triton-cpu compare it to the libraries listed below.
- An experimental CPU backend for Triton ☆175 Nov 10, 2025 Updated 3 months ago
- ☆21 Mar 3, 2025 Updated 11 months ago
- OpenAI Triton backend for Intel® GPUs ☆226 Updated this week
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch ☆18 Dec 21, 2022 Updated 3 years ago
- Repository for go shared libraries (for now). ☆11 Dec 1, 2025 Updated 2 months ago
- Automatic differentiation for Triton Kernels ☆29 Aug 12, 2025 Updated 6 months ago
- Benchmarking PyTorch 2.0 different models ☆20 Mar 19, 2023 Updated 2 years ago
- train with kittens! ☆63 Oct 25, 2024 Updated last year
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆66 Updated this week
- AI-ML-NLP Task Group ☆13 Aug 10, 2023 Updated 2 years ago
- FP4 MAC Array ☆19 Apr 14, 2024 Updated last year
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation ☆15 Aug 1, 2024 Updated last year
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset ☆12 Sep 29, 2025 Updated 4 months ago
- Cuda extensions for PyTorch ☆12 Dec 2, 2025 Updated 2 months ago
- Experiment of using Tangent to autodiff triton ☆82 Jan 22, 2024 Updated 2 years ago
- Acceleration codes for the Ozaki-scheme on integer matrix multiplication units. ☆19 Dec 10, 2025 Updated 2 months ago
- Triton for OpenCL backend, and use mlir-translate to get source OpenCL code ☆24 Aug 27, 2025 Updated 5 months ago
- Repository for AI model benchmarking on TT-Buda ☆15 Updated this week
- RESPECT: Reinforcement Learning based Edge Scheduling on Pipelined Coral Edge TPUs (DAC'23) ☆11 Apr 13, 2023 Updated 2 years ago
- ☆12 Jan 4, 2024 Updated 2 years ago
- Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction ☆13 Jul 22, 2024 Updated last year
- Minimal, dependency free implementation of the ctor crate ☆17 Aug 1, 2024 Updated last year
- Artifacts of EVT ASPLOS'24 ☆29 Mar 6, 2024 Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 Oct 5, 2024 Updated last year
- TPP experimentation on MLIR for linear algebra ☆144 Feb 2, 2026 Updated last week
- Trying to deconstruct RWKV in understandable terms ☆14 May 6, 2023 Updated 2 years ago
- Simple implementation of a GPT (training and inference) in PyTorch. ☆13 Dec 11, 2023 Updated 2 years ago
- ☆14 Apr 24, 2024 Updated last year
- Shared Middle-Layer for Triton Compilation ☆326 Dec 5, 2025 Updated 2 months ago
- ☆15 Jul 3, 2025 Updated 7 months ago
- Slides and exercises for persistent memory programming tutorial ☆14 Nov 14, 2022 Updated 3 years ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation ☆13 Dec 12, 2024 Updated last year
- ☆16 Dec 19, 2024 Updated last year
- Inference Llama 2 with a model compiled to native code by TorchInductor ☆14 Feb 8, 2024 Updated 2 years ago
- ☆104 Nov 7, 2024 Updated last year
- An MLIR-based toy DL compiler for TVM Relay. ☆61 Oct 16, 2022 Updated 3 years ago
- Writing FLUX in Triton ☆41 Sep 22, 2024 Updated last year
- ☆18 Mar 18, 2024 Updated last year
- Towards a million-node RISC-V cluster. ☆14 Mar 6, 2025 Updated 11 months ago