scalable-analyses/sme

☆36

Alternatives and similar repositories for sme

Users interested in sme are comparing it to the repositories listed below.

tzakharko / m4-sme-exploration
Exploring the scalable matrix extension of the Apple M4 processor
☆235 · Nov 7, 2024 · Updated last year
wudu98 / autoGEMM
☆15 · Dec 5, 2024 · Updated last year
RIKEN-RCCS / accelerator_for_ozIMMU
Acceleration codes for the Ozaki-scheme on integer matrix multiplication units.
☆26 · Dec 10, 2025 · Updated 7 months ago
xrq-phys / blis_apple
BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support.
☆36 · Jan 7, 2023 · Updated 3 years ago
kaleid-liner / epoll-web-server
A simple yet high performance web server written with epoll and pure c.
☆19 · Jun 7, 2019 · Updated 7 years ago
caoting-dotcom / multiBranchModel
Multi-branch model for concurrent execution
☆18 · Jun 27, 2023 · Updated 3 years ago
aws-samples / amazon-braket-community-detection
Community detection in complex networks using hybrid quantum annealing on Amazon Braket
☆13 · Jul 6, 2023 · Updated 3 years ago
cispa / ShadowLoad
☆14 · Apr 1, 2025 · Updated last year
microsoft / TileIR
☆31 · Feb 28, 2025 · Updated last year
NVIDIA / grace-cpu-benchmarking-guide
Guides and examples to help achieve optimal performance on a NVIDIA Grace CPU
☆17 · Aug 9, 2024 · Updated last year
nullplay / Unified-Convolution-Framework
☆10 · Apr 24, 2023 · Updated 3 years ago
ZJU-SEC / CrossFire
☆20 · Nov 7, 2024 · Updated last year
ninthDevilHAUNSTER / ecc_learning
shaobaobaoer's journey of learning elliptic curve cryptography
☆14 · Mar 25, 2019 · Updated 7 years ago
enp1s0 / ozIMMU
FP64 equivalent GEMM by the Ozaki scheme with Int8 Tensor Cores
☆125 · Dec 2, 2025 · Updated 7 months ago
khaki3 / ptxas-wrapper
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code
☆16 · Mar 19, 2023 · Updated 3 years ago
dougallj / asil
☆36 · Jun 15, 2026 · Updated last month
RIKEN-RCCS / GEMMul8
GEMMul8 (GEMMulate): GEMM emulation and its extension to BLAS-like matrix operations using INT8/FP8 matrix engines based on the Ozaki Sch…
☆82 · Jul 12, 2026 · Updated last week
google-parfait / cvm-side-channel-analysis
☆16 · Aug 12, 2025 · Updated 11 months ago
s3git / s3git-py
Python module for s3git: git for Cloud Storage
☆11 · May 3, 2016 · Updated 10 years ago
nDIRECT / nDIRECT
A direct convolution library targeting ARM multi-core CPUs.
☆12 · Nov 27, 2024 · Updated last year
isec-tugraz / marvellous-attacks
Attacks on Jarvis and Friday
☆10 · Oct 9, 2019 · Updated 6 years ago
opencca / opencca
OpenCCA: An Open Framework to Enable Arm CCA Research
☆23 · Sep 10, 2025 · Updated 10 months ago
renesas-rz / rz_linux-cip
☆16 · Updated this week
tgmattso / OmpCommonCore
Software to support people learning OpenMP with our book ... The OpenMP Common Core: Making OpenMP Simple Again
☆83 · Nov 12, 2023 · Updated 2 years ago
MachineLearningSystem / 25ASPLOS-Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12 · Nov 8, 2024 · Updated last year
microsoft / T-MAC
Low-bit LLM inference on CPU/NPU with lookup table
☆973 · Jun 5, 2025 · Updated last year
armfazh / redox-ecc
Elliptic curves Rust reference implementation
☆16 · Mar 10, 2024 · Updated 2 years ago
ahoi-attacks / WeSee
Using Malicious #VC Interrupts to Break AMD SEV-SNP (IEEE S&P 2024)
☆26 · Apr 22, 2024 · Updated 2 years ago
INT-FlashAttention2024 / INT-FlashAttention
☆91 · Jan 23, 2025 · Updated last year
arm-hpc-devkit / nvidia-arm-hpc-devkit-users-guide
Get started with your NVIDIA Arm HPC Developers Kit!
☆33 · Feb 16, 2023 · Updated 3 years ago
PanZaifeng / FastTree-Artifact
☆32 · Mar 24, 2025 · Updated last year
hao-ai-lab / flash-attention-fp4
NVFP4 Flash-Attention 4 on BlackWell
☆30 · Updated this week
soypat / lap
linear algebra package. like gonum/mat, but small. lets say gonum-lite
☆12 · Jul 8, 2023 · Updated 3 years ago
BuildIt-lang / buildit-array
A numpy like array programming language optimized with BuildIt
☆14 · Oct 17, 2025 · Updated 9 months ago
powderluv / mm_benchmarks
☆12 · Dec 31, 2020 · Updated 5 years ago
intel / AMX-TMUL-Code-Samples
Code samples related to Intel(R) AMX
☆38 · Apr 8, 2024 · Updated 2 years ago
abperiasamy / pinata
Play blindfold chess against any UCI compatible engines.
☆12 · Dec 4, 2023 · Updated 2 years ago
jiepengwang / MMGen
☆17 · Apr 17, 2025 · Updated last year
UDC-GAC / polybench-python
PolyBench/Python is the reimplementation of PolyBench in the Python programming language. It is a benchmark suite of 30 numerical computa…
☆10 · Feb 23, 2021 · Updated 5 years ago