libxsmm / tpp-pytorch-extension
Intel® Tensor Processing Primitives extension for PyTorch*
☆17 Updated 2 weeks ago
Alternatives and similar repositories for tpp-pytorch-extension
Users interested in tpp-pytorch-extension are comparing it to the libraries listed below.
- ☆63 Updated 10 months ago
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆41 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆211 Updated this week
- ☆108 Updated last year
- collection of benchmarks to measure basic GPU capabilities ☆429 Updated 8 months ago
- Dissecting NVIDIA GPU Architecture ☆109 Updated 3 years ago
- ☆83 Updated 2 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆35 Updated 5 years ago
- ☆148 Updated 5 months ago
- Assembler for NVIDIA Volta and Turing GPUs ☆230 Updated 3 years ago
- Development repository for the Triton-Linalg conversion ☆202 Updated 8 months ago
- ☆24 Updated 3 years ago
- ☆50 Updated 6 years ago
- Microsoft Collective Communication Library ☆363 Updated 2 years ago
- Code samples related to Intel(R) AMX ☆39 Updated last year
- A home for the final text of all TVM RFCs. ☆108 Updated last year
- ☆33 Updated last year
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆119 Updated this week
- Shared Middle-Layer for Triton Compilation ☆289 Updated last week
- ☆286 Updated 3 weeks ago
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆19 Updated 7 years ago
- ☆53 Updated 3 months ago
- ☆18 Updated 7 months ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆116 Updated 2 years ago
- TACOS: [T]opology-[A]ware [Co]llective Algorithm [S]ynthesizer for Distributed Machine Learning ☆26 Updated 4 months ago
- ☆154 Updated last year
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆472 Updated this week
- ☆153 Updated 9 months ago
- ☆32 Updated 3 years ago
- This is the top-level repository for the Accel-Sim framework. ☆491 Updated last week