Example code for Intel AVX / AVX2 intrinsics.
☆145Sep 18, 2023Updated 2 years ago
Alternatives and similar repositories for AVX-AVX2-Example-Code
Users that are interested in AVX-AVX2-Example-Code are comparing it to the libraries listed below
- Short examples illustrating AVX2 intrinsics for simple tasks.☆98Mar 13, 2024Updated last year
- Rebuild YatSenOS On RISC-V 64.☆22Jan 6, 2022Updated 4 years ago
- A simple demonstration of how PyTorch autograd works☆16Sep 23, 2021Updated 4 years ago
- A Method for efficiently processing SpMV using SIMD and load balancing☆17Apr 4, 2022Updated 3 years ago
- Documentation for YatCPU☆54Nov 15, 2023Updated 2 years ago
- Parallelized and vectorized SpMV on Intel Xeon Phi (Knights Landing, AVX512, KNL)☆24Feb 12, 2024Updated 2 years ago
- A test library for computing modular exponentiation in parallel using AVX-512 vector arithmetic☆12Dec 18, 2023Updated 2 years ago
- ☆10May 21, 2020Updated 5 years ago
- Example code for Intel AVX / AVX2 intrinsics.☆21Oct 6, 2018Updated 7 years ago
- AVX-optimized sin(), cos(), exp() and log() functions☆128Jan 15, 2022Updated 4 years ago
- Portable wrapper for SIMD and vector instructions written in C++11. Compatible with NEON, SSE, AVX, AVX-512 and SVE (length specific).☆518Dec 4, 2025Updated 2 months ago
- This repo contains LaTeX template for experiment report.☆11Aug 17, 2021Updated 4 years ago
- a pytorch implementation of Google GEDLoss☆32Dec 9, 2020Updated 5 years ago
- follow NVIDIA, simplify it and support data parallel.☆13Sep 26, 2019Updated 6 years ago
- Containerization and deployment scripts for remote-index-server and workflows to generate monolithic index for github.com/llvm/llvm-proje…☆25Updated this week
- QCD for Intel Xeon Phi and Xeon processors☆14Mar 20, 2024Updated last year
- Sun Yat-sen University Computer Networks labs (Spring 2019): configuration labs, programming labs, and "小溪网" theory exercises☆50Dec 17, 2020Updated 5 years ago
- I would like to share my collection and my homework in SYSU Computer Science courses and elected courses.☆68Jul 6, 2023Updated 2 years ago
- Universal Presentation: A Header-only C++ Library to Cout STL containers and more☆18Aug 14, 2023Updated 2 years ago
- Platypus Educational Samples☆23May 21, 2021Updated 4 years ago
- CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution☆17Jun 25, 2023Updated 2 years ago
- how to design cpu gemm on x86 with avx256, that can beat openblas.☆73Apr 15, 2019Updated 6 years ago
- ☆16Apr 11, 2022Updated 3 years ago
- ☆21Dec 22, 2025Updated 2 months ago
- ☆1,992Jul 29, 2023Updated 2 years ago
- SpMV using CUDA☆20Mar 5, 2018Updated 7 years ago
- Sun Yat-sen University Introduction to Robotics (Fall 2019): a simple smart car based on Arduino and Raspberry Pi☆18Feb 27, 2021Updated 5 years ago
- 🌈 Path tracer implemented in OCaml based on "Ray Tracing in One Weekend"☆20May 18, 2022Updated 3 years ago
- Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner☆21Sep 12, 2025Updated 5 months ago
- progressive path tracer written in taichi☆23Jun 15, 2022Updated 3 years ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme…☆22Apr 25, 2024Updated last year
- A GPU FP32 computation method with Tensor Cores.☆26Dec 8, 2025Updated 2 months ago
- Yet another toy CPU.☆93Dec 10, 2023Updated 2 years ago
- This is a tuned sparse matrix dense vector multiplication (SpMV) library☆22Mar 21, 2016Updated 9 years ago
- OS with Rust and UEFI☆17Jan 8, 2023Updated 3 years ago
- A CPU tool for benchmarking the peak of floating points☆579Feb 7, 2026Updated 3 weeks ago
- ☆98Feb 10, 2017Updated 9 years ago
- Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading.☆163Feb 3, 2022Updated 4 years ago
- ☆152Jan 9, 2025Updated last year