nadavrot / bistra
Bistra is a domain-specific language for generating high-performance kernels (such as GEMMs and convolutions). The language is designed to allow powerful compiler optimizations and code generation that are not possible in C. The tool can auto-tune GEMM kernels to around 90% of peak performance (on x86/AVX2) within seconds.
☆5 · Updated 6 months ago
Related projects:
- A fast implementation of log() and exp() ☆49 · Updated last year
- Collection of C++ containers extracted from LLVM ☆26 · Updated 3 years ago
- ☆21 · Updated 2 years ago
- Parser combinator and AST generator in C++17 ☆24 · Updated last year
- ☆24 · Updated 4 months ago
- ☆27 · Updated last year
- moderngpu algorithms for C++ shaders ☆16 · Updated 3 years ago
- Bytecode interpreter ☆67 · Updated 5 months ago
- Performance experiments for C++ exception handling ☆29 · Updated 2 years ago
- crefl is a runtime library and compiler plug-in to support reflection in C. ☆37 · Updated 2 weeks ago
- Quick 'n' Dirty benchmarks for various integer parsing methods in C++ ☆40 · Updated 4 years ago
- libcubwt is a library for GPU-accelerated suffix array and Burrows-Wheeler transform construction. ☆29 · Updated 7 months ago
- A vectorized single-header hash function. ☆17 · Updated last year
- Support for ternary logic in SSE, XOP, AVX2 and x86 programs ☆30 · Updated 3 years ago
- Dynamic runtime inlining with LLVM ☆65 · Updated 2 years ago
- Highly composable C++17 template metaprogramming library ☆39 · Updated 5 years ago
- A benchmark for cache-efficient data structures. ☆28 · Updated 5 years ago
- Wyrm is a GCC GIMPLE to LLVM IR transpiler ☆51 · Updated 7 months ago
- Workflows to build daily and ad hoc compilers for Compiler Explorer ☆18 · Updated this week
- A fast, zero-dependency, single-header WebAssembly interpreter ☆34 · Updated 11 months ago
- SIMD recipes for various platforms (collection of code snippets) ☆47 · Updated 3 years ago
- A performant, parallel, probabilistic, random acyclic-graph, low-latency, perfect hash generation library. ☆65 · Updated last week
- Improved NetBSD's Perfect Hash Generation Tool v3 ☆15 · Updated 4 months ago
- C++20 implementation of a 16-bit floating-point type mimicking most of the IEEE 754 behavior. Single file and header-only. ☆30 · Updated 7 months ago
- ☆47 · Updated 3 years ago
- Interchangeable AoS and SoA containers ☆22 · Updated 2 years ago
- Visualize SIMD instructions ☆33 · Updated last year
- ☆39 · Updated 3 years ago
- Lightweight framework for easy and efficient code generation ☆95 · Updated last month
- Experimental patches to implement missing C++20 modules features for the clang/LLVM toolchain. ☆23 · Updated 2 years ago