enp1s0/ozIMMU

FP64 equivalent GEMM by the Ozaki scheme with Int8 Tensor Cores

☆125

Alternatives and similar repositories for ozIMMU

Users that are interested in ozIMMU are comparing it to the libraries listed below.

RIKEN-RCCS / accelerator_for_ozIMMU
Acceleration codes for the Ozaki-scheme on integer matrix multiplication units.
☆26 · Dec 10, 2025 · Updated 7 months ago
RIKEN-RCCS / GEMMul8
GEMMul8 (GEMMulate): GEMM emulation and its extension to BLAS-like matrix operations using INT8/FP8 matrix engines based on the Ozaki Sch…
☆83 · Jul 12, 2026 · Updated last week
enp1s0 / cuMpSGEMM
Fast SGEMM emulation on Tensor Cores
☆17 · Feb 16, 2025 · Updated last year
ayosprakob / grassmanntn
A Python package for Grassmann tensor network computation
☆18 · Dec 12, 2024 · Updated last year
Katagiri-Hoshino-Lab / VibeCodeHPC
CLI-based multi-agents for Auto-Tuning (e.g. HPC code optimization loops) supporting Local LLMs
☆39 · Mar 25, 2026 · Updated 4 months ago
ChASE-library / ChASE
This repository mirrors the principal Gitlab repository of the Chebyshev Accelerated Subspace iteration Eigensolver. If you want to contr…
☆20 · Jul 8, 2026 · Updated 2 weeks ago
nakatamaho / dgemm_tutorial
A gemm_tutorial
☆38 · May 31, 2025 · Updated last year
scalable-analyses / sme
☆36 · Mar 31, 2025 · Updated last year
wmmae / wmma_extension
An extension library of WMMA API (Tensor Core API)
☆115 · Jul 12, 2024 · Updated 2 years ago
Deepdive543443 / Benchncnn-3DS
Benchmark your NCNN models on 3DS (or crash)
☆10 · Apr 15, 2024 · Updated 2 years ago
PENGUINLIONG / graphi-t
Handy tools & graphics API abstraction for blazing fast prototyping
☆10 · Jan 17, 2024 · Updated 2 years ago
ORNL-QCI / exatn
Hierarchical Tensor Networks at Exascale
☆69 · Jul 24, 2023 · Updated 3 years ago
NVIDIA / HMM_sample_code
CUDA 12.2 HMM demos
☆21 · Jul 26, 2024 · Updated last year
zhongjingjogy / use-eigen-with-cmake
Usage of Eigen library with CMake.
☆17 · Sep 18, 2018 · Updated 7 years ago
herumi / s_xbyak
ASM generation tool for GAS/NASM/MASM with Xbyak-like syntax in Python
☆13 · Nov 10, 2025 · Updated 8 months ago
variPEPS / variPEPS_Python
variPEPS: Versatile tensor network library for variational ground state simulations in two spatial dimensions
☆19 · Jul 9, 2026 · Updated 2 weeks ago
IDSIA / rtrl-elstm
Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)
☆13 · Jun 11, 2025 · Updated last year
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
yikangshen / megablocks
☆20 · May 30, 2024 · Updated 2 years ago
eth-cscs / COSTA
Distributed Communication-Optimal Shuffle and Transpose Algorithm
☆14 · Apr 18, 2026 · Updated 3 months ago
MLSysU / EcoServe
[OSDI '26] Efficient LLM Serving on Commodity GPU Clusters with Data-Reduced Cross-Instance Orchestration
☆23 · Jul 5, 2026 · Updated 3 weeks ago
TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 · Nov 19, 2024 · Updated last year
jurajHasik / peps-torch
Solving two-dimensional spin models with tensor networks (powered by PyTorch)
☆103 · May 6, 2026 · Updated 2 months ago
yastn / yastn
Yet another symmetric tensor network
☆50 · Updated this week
mhauru / abeliantensors
A library for Abelian symmetry preserving tensors in Python 3
☆24 · Apr 27, 2021 · Updated 5 years ago
wannier-utils-dev / cif2qewan
☆18 · Dec 5, 2025 · Updated 7 months ago
vatai / tadashi
A library for code transformations with guaranteed legality
☆18 · Jun 12, 2026 · Updated last month
nicejunjie / scilib-accel
Automatic GPU offload for scientific libraries
☆18 · Jun 17, 2026 · Updated last month
radarFudan / Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
☆19 · May 8, 2025 · Updated last year
icl-utk-edu / slate
SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy…
☆135 · Oct 21, 2025 · Updated 9 months ago
fujitsu / oneDNN
oneAPI Deep Neural Network Library (oneDNN)
☆10 · Feb 2, 2022 · Updated 4 years ago
mizu-bai / ncnn-fortran
Call ncnn from Fortran
☆18 · Dec 18, 2022 · Updated 3 years ago
tk-rusch / unicornn
Official code for UnICORNN (ICML 2021)
☆28 · Oct 1, 2021 · Updated 4 years ago
daisytuner / docc
Daisytuner Optimizing Compiler Collection (docc)
☆22 · Updated this week
HazyResearch / train-tk
train with kittens!
☆67 · Oct 25, 2024 · Updated last year
IBM / selective-dense-state-space-model
Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …
☆16 · Sep 18, 2025 · Updated 10 months ago
jhpc-quantum / RIKEN-braket
Software/library for simulations of quantum gates
☆22 · Updated this week
RIKEN-RCCS / RAPTOR
☆19 · Jun 26, 2026 · Updated last month
gunrock / loops
🎃 GPU load-balancing library for regular and irregular computations.
☆67 · Jun 25, 2026 · Updated last month