MegEngine/cutlass-bak

modified cutlass

☆16

Alternatives and similar repositories for cutlass-bak

Users that are interested in cutlass-bak are comparing it to the libraries listed below.

SwaggasDeCatas / emuThreeDS
World's first Nintendo 3DS emulator for Apple devices based on Citra.
☆18 · Apr 7, 2023 · Updated 3 years ago
manishucsd / py-codegen
☆16 · Sep 24, 2024 · Updated last year
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 · Jul 28, 2020 · Updated 5 years ago
MegEngine / cutlass
CUDA Templates for Linear Algebra Subroutines
☆101 · Apr 25, 2024 · Updated 2 years ago
pku-liang / popa
A unified programming framework for high and portable performance across FPGAs and GPUs
☆11 · Mar 23, 2025 · Updated last year
wongsingfo / paper-util
Utilities for paper writing.
☆12 · Jan 11, 2026 · Updated 6 months ago
ROCm / MISA
Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)
☆36 · Jul 30, 2025 · Updated 11 months ago
PENGUINLIONG / graphi-t
Handy tools & graphics API abstraction for blazing fast prototyping
☆10 · Jan 17, 2024 · Updated 2 years ago
tensorcast-ai / tensorcast
The high-performance distributed tensor layer: load once, share everywhere.
☆30 · Jun 23, 2026 · Updated 3 weeks ago
HicrestLaboratory / SPARTA
SParse AcceleRation on Tensor Architecture
☆18 · Apr 15, 2026 · Updated 3 months ago
yester31 / Cutlass_EX
study of cutlass
☆22 · Nov 10, 2024 · Updated last year
UCLA-VAST / heterohalide
HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration
☆15 · Sep 14, 2020 · Updated 5 years ago
UofT-EcoSystem / BPPSA-open
The (open-source part of) code to reproduce "BPPSA: Scaling Back-propagation by Parallel Scan Algorithm".
☆13 · Jun 7, 2021 · Updated 5 years ago
YashasSamaga / ConvolutionBuildingBlocks
GEMM and Winograd based convolutions using CUTLASS
☆28 · Jul 15, 2020 · Updated 6 years ago
LeiWang1999 / Stream-k.tvm
☆20 · Sep 28, 2024 · Updated last year
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 · May 12, 2024 · Updated 2 years ago
xiezhq-hermann / Algorithm-problem-set-solution
Homework solutions to 2017 Fall Algorithm Courses in ShanghaiTech
☆10 · Jan 5, 2018 · Updated 8 years ago
tdietert / lambda-pi
A toy implementation of the dependently typed lambda calculus known as λΠ
☆12 · Jan 29, 2020 · Updated 6 years ago
Lysxia / quickcheck-higherorder
QuickCheck extension for higher-order properties
☆19 · Feb 14, 2022 · Updated 4 years ago
megvii-research / basedet
An object detection codebase based on MegEngine.
☆28 · Dec 14, 2022 · Updated 3 years ago
nihui / ncnn_on_xr806
☆15 · Dec 16, 2021 · Updated 4 years ago
stganser / polyite
Polyite: Iterative Schedule Optimization for Parallelization in the Polyhedron Model
☆12 · Jan 19, 2020 · Updated 6 years ago
mochi-hpc / mochi-thallium
Thallium is a C++14 library wrapping Margo, Mercury, and Argobots and providing an object-oriented way to use these libraries.
☆16 · May 4, 2026 · Updated 2 months ago
zouyonghao / DistFuzz
Blackbox Fuzzing of Distributed Systems with Multi-Dimensional Inputs and Symmetry-Based Feedback Pruning
☆13 · Mar 7, 2025 · Updated last year
josehu07 / summerset
Distributed, Replicated, Protocol-generic Key-value Store in Async Rust for SMR Protocols Research
☆18 · Updated this week
zhuohaoyu / ORPS
☆15 · Jul 15, 2025 · Updated last year
Chen-Binghao / PilotFish
PilotFish harvests the free GPU cycles of cloud gaming with deep learning training
☆14 · Jul 2, 2022 · Updated 4 years ago
vkrasnov / vpmadd
Multiplication using AVX512 and AVX512IFMA instructions
☆25 · Nov 9, 2015 · Updated 10 years ago
drewhannay / paxos
A simulator for the Paxos Protocol for consensus in distributed systems
☆21 · Dec 19, 2012 · Updated 13 years ago
hpcgarage / cuASR
cuASR: CUDA Algebra for Semirings
☆49 · Aug 22, 2022 · Updated 3 years ago
pku-liang / Cement
The Next-gen Language & Compiler Powering Efficient Hardware Design
☆38 · Jan 16, 2025 · Updated last year
wkqscut / DCGNet
The code for IJCAI 2019 paper "Deep Cascade Generation on Point Sets"
☆14 · Oct 3, 2023 · Updated 2 years ago
BUAA-CI-LAB / GNN-Feature-Decomposition
Using Feature Decomposition method to accelerate GNN inference
☆13 · Sep 27, 2021 · Updated 4 years ago
XiuYuLi / deepcore_source_code
Subpart source code of deepcore v0.7
☆27 · Jun 28, 2020 · Updated 6 years ago
uwsampl / paper-agents
☆13 · Dec 9, 2024 · Updated last year
wongsingfo / pku-grad-thesis
Peking University undergraduate thesis LaTeX template, modified from pkuthss 1.9.0
☆32 · May 15, 2022 · Updated 4 years ago
CRobeck / instrument-amdgpu-kernels
LLVM/MLIR based compiler instrumentation of AMD GPU kernels
☆21 · Jul 13, 2025 · Updated last year
NVlabs / mixedproxy
☆15 · Nov 14, 2023 · Updated 2 years ago
sebfisch / incremental-sat-solver
Simple, Incremental SAT Solving as a Haskell Library
☆15 · Aug 31, 2016 · Updated 9 years ago