Musings in GEMM (General Matrix Multiplication)
☆14Dec 14, 2025Updated 3 months ago
Alternatives and similar repositories for gemm
Users that are interested in gemm are comparing it to the libraries listed below
- PolyLib official git.☆11Jan 27, 2026Updated last month
- A little library for using SIMD instructions for x86 and ARM, wrapping Agner Fog's vectorclass for x86 and filling some of its functional…☆17Dec 10, 2021Updated 4 years ago
- learn llvm from scratch☆14Apr 29, 2023Updated 2 years ago
- GEMM☆10Aug 26, 2023Updated 2 years ago
- N-body simulation based on CUDA.☆14Jun 20, 2019Updated 6 years ago
- Implementation of DeepMind's "Sobolev Training for Neural Networks"☆11Apr 2, 2018Updated 7 years ago
- Flat sorted array with very fast insert and erase operations☆18Sep 26, 2025Updated 5 months ago
- Implementation of a moving focus frame for Android TV☆19Sep 26, 2017Updated 8 years ago
- Quantum annealing for traveling salesman problem☆11Jul 19, 2018Updated 7 years ago
- Docker image with Intel Parallel Studio XE Composer Edition for C++☆16Aug 18, 2018Updated 7 years ago
- ☆14Mar 10, 2026Updated last week
- Chinese edition of Parallel Programming for FPGAs☆15Nov 20, 2019Updated 6 years ago
- A Zig Shell☆13Jul 25, 2025Updated 7 months ago
- Microbenchmarks and Google Benchmark library☆25Aug 5, 2024Updated last year
- Training neural networks with target propagation☆13May 19, 2017Updated 8 years ago
- ☆15Dec 9, 2018Updated 7 years ago
- Write games using jok (Zig) through MoonBit (Wasm).☆16Mar 24, 2025Updated 11 months ago
- core WebGPU shaders☆15Aug 18, 2024Updated last year
- My Collection of Raspberry Pi projects☆20May 26, 2021Updated 4 years ago
- Use your bluetooth device in Linux (Ubuntu) and Windows without having to pair it on every boot.☆11Apr 12, 2021Updated 4 years ago
- Reference: LEE Kisung and WON Youjip "Smart layers and dumb result: Io characterization of an android-based smartphone" In EMSOFT 2012: …☆24Oct 19, 2016Updated 9 years ago
- A simple implementation of convolutional networks in Matlab☆10Mar 3, 2015Updated 11 years ago
- JAX bindings for the flash-attention3 kernels☆21Jan 2, 2026Updated 2 months ago
- Disk and file system benchmark, especially intended for flash storage☆23Feb 24, 2015Updated 11 years ago
- Demo: Slightly More Bio-Plausible Backprop☆21Mar 9, 2017Updated 9 years ago
- A Vulkan driver to stream commands over TCP☆24Nov 15, 2025Updated 4 months ago
- Network speed test covering network latency and upload/download speed, based on OkHttp 3.0☆28Apr 28, 2017Updated 8 years ago
- Running linear algebra as fast as possible on Apple silicon☆28Aug 18, 2023Updated 2 years ago
- The link to the stored-in-image imagenet64x64 dataset. And a resnet/wrn code for it.☆15Aug 24, 2022Updated 3 years ago
- Benchmarks of various systems using Box2D☆67Jun 4, 2016Updated 9 years ago
- ☆11Feb 28, 2023Updated 3 years ago
- ☆22Mar 23, 2023Updated 2 years ago
- (WIP) trying to port LKL to wasm☆26Feb 28, 2024Updated 2 years ago
- Unofficial pytorch implementation of ReZero in ResNet☆24Mar 29, 2020Updated 5 years ago
- Container and system event tracing using eBPF☆35Feb 17, 2026Updated last month
- Input-aware cuBLAS/clBLAS implementation for better performance☆17Aug 4, 2022Updated 3 years ago
- This is a linear algebra library written using MoonBit, aiming to fill the gap in scientific computing applications in the MoonBit ecosys…☆14Feb 4, 2026Updated last month
- Full data science workflows on the web☆21Apr 25, 2019Updated 6 years ago
- ☆36Updated this week