bcarlet / ptx-math
☆14 · Updated 2 years ago
Alternatives and similar repositories for ptx-math:
Users who are interested in ptx-math are comparing it to the libraries listed below:
- ☆56 · Updated last month
- Embedded Universal DSL: a good DSL for us, by us ☆36 · Updated this week
- LLVM-Canon aims to transform LLVM modules into a canonical form by reordering and renaming instructions while preserving the same semanti… ☆15 · Updated last year
- Some experiments with SMT solvers and GIMPLE IR ☆74 · Updated last week
- Support for ternary logic in SSE, XOP, AVX2 and x86 programs ☆31 · Updated 4 months ago
- A header-only C++ library for writing compiler/interpreter frontends. ☆14 · Updated last month
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 · Updated 2 years ago
- ☆76 · Updated this week
- Library with JIT (Just-in-time) compilation support to optimize performance of small and medium matrix multiplication ☆14 · Updated 4 years ago
- UB-aware interpreter for LLVM debugging ☆27 · Updated last week
- The Farm-SVE package provides a header that implements the ARM C language extensions (ACLE) for the ARM Scalable Vector Extension (SVE) i… ☆14 · Updated last year
- Fork of LLVM for demonstrating optimization pass development ☆31 · Updated 2 years ago
- ASM methods to test small loop performance on x86 ☆13 · Updated 5 years ago
- ☆29 · Updated 2 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 · Updated last month
- A minimal (really) out-of-tree MLIR example ☆44 · Updated last week
- InstLatX64_Demo ☆43 · Updated this week
- A utility library to bridge llvm and mlir gaps. ☆13 · Updated 4 months ago
- ☆38 · Updated 3 years ago
- ☆20 · Updated 2 years ago
- Information about AVX-512 support on recent Intel processors ☆45 · Updated 3 years ago
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator. ☆38 · Updated 3 years ago
- CDSChecker: A Model Checker for C11 and C++11 Atomics ☆29 · Updated 11 years ago
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆119 · Updated 2 years ago
- CERE: Codelet Extractor and REplayer ☆40 · Updated last year
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 · Updated 9 years ago
- A fast implementation of log() and exp() ☆53 · Updated 2 years ago
- Automatic Binary Parallelisation ☆43 · Updated 2 months ago
- Declarative MLIR compilers in Python! ☆33 · Updated 4 years ago
- SIMDized check which bytes are in a set ☆28 · Updated 6 years ago