gty111/PTX-EMU

PTX-EMU is a simple emulator for CUDA programs.

☆40

Alternatives and similar repositories for PTX-EMU

Users that are interested in PTX-EMU are comparing it to the libraries listed below.

illinois-impact / klap
A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15 · Jun 21, 2019 · Updated 7 years ago

SYSU-SCC / sysu-scc-spack-repo
Spack package repository maintained by Student Cluster Competition Team @ Sun Yat-sen University.
☆16 · Aug 20, 2025 · Updated 11 months ago

Nelson-Cheung / yatsenos-riscv
Rebuild YatSenOS On RISC-V 64.
☆23 · Jan 6, 2022 · Updated 4 years ago

TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 · May 12, 2024 · Updated 2 years ago

howardlau1999 / hcache-uring
2022 ECS CloudBuild Distributed Cache Contest - Final Round https://tianchi.aliyun.com/competition/entrance/531982/introduction
☆17 · Dec 8, 2022 · Updated 3 years ago

vortexgpgpu / Volt
☆18 · Feb 9, 2026 · Updated 5 months ago

TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 · Nov 19, 2024 · Updated last year

stellaraccident / mlir-py-release
☆13 · Jul 9, 2021 · Updated 5 years ago

OpenPPL / CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
☆85 · Mar 20, 2023 · Updated 3 years ago

feifeibear / ChituAttention
Quantized Attention on GPU
☆45 · Nov 22, 2024 · Updated last year

ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆59 · Mar 20, 2025 · Updated last year

zhuzilin / flash-attention-with-sink
☆37 · Aug 7, 2025 · Updated 11 months ago

arcsysu / SYSU-ARCH
SYSU-ARCH is a lab that focuses on using and extending simulators.
☆10 · Dec 19, 2022 · Updated 3 years ago

YJMSTR / flash-linear-attention
FLA but cuTile
☆27 · Apr 17, 2026 · Updated 3 months ago

ademeure / DeeperGEMM
DeeperGEMM: crazy optimized version
☆86 · May 5, 2025 · Updated last year

YdrMaster / cuda-driver
A CUDA runtime environment based on the CUDA Driver API
☆16 · Jul 30, 2025 · Updated 11 months ago

marcosamaris / gpuperfpredict
Predict Performance of GPU Applications using analytical model and Machine Learning
☆11 · Aug 31, 2022 · Updated 3 years ago

wu-kan / GoPTX
GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving
☆21 · Jul 30, 2025 · Updated 11 months ago

MLSysU / EcoServe
[OSDI '26] Efficient LLM Serving on Commodity GPU Clusters with Data-Reduced Cross-Instance Orchestration
☆23 · Jul 5, 2026 · Updated 3 weeks ago

KuangjuX / cuda-evolve-oss
Autonomous GPU kernel optimization system driven by AI agents.
☆31 · Mar 29, 2026 · Updated 4 months ago

alan-hpc / cuda_op_benchmark
An easily extensible framework for understanding and optimizing CUDA operators, intended for learning purposes only
☆18 · Jun 13, 2024 · Updated 2 years ago

0xD0GF00D / DocumentSASS
Unofficial description of the CUDA assembly (SASS) instruction sets.
☆225 · Jul 18, 2025 · Updated last year

hplp / aes_chisel
Implementation of the Advanced Encryption Standard in Chisel
☆19 · Apr 18, 2022 · Updated 4 years ago

NVIDIA / cuEmbed
CUDA Embedding Lookup Kernel Library
☆48 · Jun 26, 2026 · Updated last month

project-flexos / asplos22-ae
FlexOS: Towards Flexible OS Isolation (ASPLOS'22) Artifact Evaluation Repository
☆19 · Apr 2, 2022 · Updated 4 years ago

JohndeVostok / APE
A GPU FP32 computation method with Tensor Cores.
☆27 · Dec 8, 2025 · Updated 7 months ago

faasm / faasmjs
Serverless browser offloading with Faasm and WebAssembly
☆16 · Feb 14, 2022 · Updated 4 years ago

khaki3 / ptxas-wrapper
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code
☆16 · Mar 19, 2023 · Updated 3 years ago

aoli-al / HFuse
Horizontal Fusion
☆24 · Jan 7, 2022 · Updated 4 years ago

sjfeng1999 / gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
☆126 · Jul 11, 2022 · Updated 4 years ago

wzh99 / relay-mlir
An MLIR-based toy DL compiler for TVM Relay.
☆62 · Oct 16, 2022 · Updated 3 years ago

ConvolutedDog / HyFiSS
HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs
☆42 · Dec 9, 2024 · Updated last year

HanGuo97 / hilt
☆40 · Dec 14, 2025 · Updated 7 months ago

polymage-labs / mlirx
MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com
☆39 · Dec 1, 2023 · Updated 2 years ago

UCLA-VAST / heterohalide
HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration
☆15 · Sep 14, 2020 · Updated 5 years ago

vortexgpgpu / NVPTX-SPIRV-Translator
A translator from NVPTX to SPIR-V, modified from the LLVM-SPIR-V Translator.
☆45 · Oct 25, 2021 · Updated 4 years ago

ahmedheakl / CASS
[ACL 2026 🔥] CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark
☆35 · Apr 20, 2026 · Updated 3 months ago

zartbot / gfd
GPU Functional Descriptor for memory access
☆34 · May 24, 2026 · Updated 2 months ago

HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 · Jan 22, 2026 · Updated 6 months ago