CoffeeBeforeArch / bits_of_architecture
Slides from the "Bits of Architecture" series on YouTube
☆28Updated 3 years ago
Alternatives and similar repositories for bits_of_architecture
Users that are interested in bits_of_architecture are comparing it to the libraries listed below
- ☆125Updated 2 years ago
- X86 CPU topics overview for developers, oriented towards performance☆204Updated last month
- Omnitrace: Application Profiling, Tracing, and Analysis☆346Updated 3 weeks ago
- A profiler to disclose and quantify hardware features on GPUs.☆175Updated 3 years ago
- Learn LLVM 17, published by Packt☆211Updated last year
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A…☆285Updated 10 months ago
- Unofficial description of the CUDA assembly (SASS) instruction sets.☆200Updated 6 months ago
- Slides and other materials from CppCon 2022☆563Updated 5 months ago
- Demonstration of various hardware effects on CUDA GPUs.☆391Updated 2 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆57Updated 10 months ago
- MLIR-based toolkit targeting intel heterogeneous hardware☆51Updated this week
- Slides and other materials from CppCon 2021☆118Updated 2 years ago
- My notes on various HPC papers.☆25Updated 3 years ago
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆141Updated last year
- Hexagon-MLIR is a compiler toolchain for compiling and executing AI kernels and models on Qualcomm Hexagon Neural Processing Units (NPUs)…☆27Updated this week
- Graphics Processing Unit (GPU) Architecture Guide☆272Updated 4 years ago
- Serial and parallel implementations of matrix multiplication☆45Updated 4 years ago
- C++ files from the "C++ Crash Course" YouTube series by CoffeeBeforeArch☆108Updated 3 years ago
- GPUOcelot: A dynamic compilation framework for PTX☆219Updated last year
- Advanced Matrix Extensions (AMX) Guide☆109Updated 4 years ago
- Tutorial on building a GPU compiler backend in LLVM☆53Updated last year
- ☆204Updated this week
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.☆91Updated 2 weeks ago
- MLIR Sample dialect☆136Updated last month
- SYCL Academy, a set of learning materials for SYCL heterogeneous programming☆527Updated 3 weeks ago
- Welcome to OptML! This repository is designed for those new to MLIR and machine learning-based optimizations. As a compiler enthusiast, I…☆20Updated last year
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more☆143Updated 7 months ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs☆120Updated last week
- Slides and other materials from CppCon 2023☆336Updated last year
- A lightweight memory allocator for hardware-accelerated machine learning☆181Updated 4 months ago