HabanaAI / tpc_llvm
The TPC-CLANG compiler, which compiles the TPC C programming language used in HabanaLabs Deep-Learning Accelerators
☆26Updated last month
Alternatives and similar repositories for tpc_llvm:
Users that are interested in tpc_llvm are comparing it to the libraries listed below
- An Open Source Kepler GPU Assembler☆20Updated 7 years ago
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆111Updated last week
- ☆56Updated last week
- ☆48Updated 5 years ago
- Bandwidth test for ROCm☆52Updated 3 weeks ago
- IREE plugin repository for the AMD AIE accelerator☆71Updated this week
- This project records the process of optimizing SGEMM (single-precision floating point General Matrix Multiplication) on the riscv platfor…☆18Updated last month
- Bridging polyhedral analysis tools to the MLIR framework☆106Updated last year
- A repository that complements gpgpu-sim, providing automated regression scripts, simulation launching utilities and the code + arguments …☆68Updated 4 years ago
- ☆131Updated this week
- ☆40Updated 4 years ago
- ☆51Updated 5 years ago
- Performance Prediction Toolkit☆51Updated 3 weeks ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi☆38Updated 2 years ago
- Flexible GPGPU instrumentation☆86Updated 5 years ago
- GPUDirect Async support for IB Verbs☆91Updated 2 years ago
- Data-Centric MLIR dialect☆39Updated last year
- Trying to figure various CPU things out☆73Updated 10 months ago
- A simple end-to-end example of taking an ML graph (TF2 / PyTorch) and running it on a device [cpu, gpu]☆29Updated 3 years ago
- The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github…☆32Updated last month
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)☆38Updated this week
- Assembler for NVIDIA FERMI. Imported from Google Code☆71Updated 9 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com☆38Updated last year
- Implement asm gemm on vega64 for 4096x4096 fp32 matrix☆21Updated 5 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs☆27Updated 3 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.☆48Updated 2 weeks ago
- ☆23Updated 5 years ago
- GPUDirect example☆58Updated 3 years ago
- Decuda and cudasm, the CUDA binary utilities package. Low-level tools for NVidia G80 GPUs.☆97Updated 14 years ago
- ROB size testing utility☆140Updated 3 years ago