lightsighter/CudaDMA

Emulating DMA Engines on GPUs for Performance and Portability

☆43

Alternatives and similar repositories for CudaDMA

Users who are interested in CudaDMA compare it to the repositories listed below.

lightsighter / Weft
A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels
☆19 · Jun 17, 2015 · Updated 11 years ago
ucb-bar / RoSE
A unified simulation platform that combines hardware and software, enabling pre-silicon, full-stack, closed-loop evaluation of your robot…
☆47 · Jul 8, 2026 · Updated 2 weeks ago
denght23 / CAVER
NS3 simulator for RDMA load balancing
☆12 · Jan 31, 2025 · Updated last year
lulinchen / FPGA_CryptoNight_V7
FPGA CryptoNight V7 Miner
☆31 · Aug 26, 2019 · Updated 6 years ago
hsundar / dendro
Parallel Algorithms for Octree Meshing
☆12 · Dec 31, 2015 · Updated 10 years ago
dame-cell / Triformer
Transformers components but in Triton
☆34 · May 9, 2025 · Updated last year
owensgroup / gpustats
Statistics on GPUs
☆33 · May 5, 2026 · Updated 2 months ago
FelixWinterstein / LEAP-HLS
Rapid system integration of high-level synthesis kernels using the LEAP FPGA framework
☆12 · Apr 17, 2016 · Updated 10 years ago
SourceryTools / nvptx-tools
nvptx-tools: a collection of tools for use with nvptx-none GCC toolchains.
☆53 · Apr 7, 2026 · Updated 3 months ago
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆13 · Nov 23, 2024 · Updated last year
azonenberg / sata-sniffer
SATA sniffing
☆15 · Jul 28, 2022 · Updated 3 years ago
ucb-bar / virgo
Cluster-level matrix unit integration into GPUs, implemented in Chipyard SoC
☆58 · Jan 20, 2026 · Updated 6 months ago
sparsh0mittal / destiny_3d_cache
Source code for DESTINY, a tool for modeling 2D and 3D caches designed with SRAM, eDRAM, STT-RAM, ReRAM and PCM. This is mirror of follow…
☆27 · Dec 18, 2024 · Updated last year
wubinyi / Convolutional-Neural-Network-Accelerator
Deep learning accelerator for convolutional layer (convolution operation) and fully-connected layer (matrix-multiplication).
☆20 · Nov 18, 2018 · Updated 7 years ago
thu-ml / Jetfire-INT8Training
☆63 · Jul 21, 2024 · Updated 2 years ago
NVlabs / cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
☆87 · Feb 21, 2024 · Updated 2 years ago
lixiuhong / batched_gemm
☆40 · Feb 28, 2020 · Updated 6 years ago
openucx / torch-ucc
PyTorch UCC plugin
☆23 · Jul 8, 2021 · Updated 5 years ago
wzc810049078 / ZC-RISCV-CORE
ZC RISCV CORE
☆12 · Dec 19, 2019 · Updated 6 years ago
cchan / fp8_mul
A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission.
☆14 · Nov 23, 2022 · Updated 3 years ago
SNU-ARC / OpenDNN
OpenDNN: An Open-source, cuDNN-like Deep Learning Primitive Library
☆29 · Dec 9, 2019 · Updated 6 years ago
ademeure / cuda-side-boost
☆60 · Feb 24, 2026 · Updated 4 months ago
hngenc / stellar
☆35 · Nov 6, 2024 · Updated last year
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆42 · Jul 24, 2024 · Updated last year
AnonymousYWL / MYLIB
☆18 · Apr 8, 2022 · Updated 4 years ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
skiphansen / panog2_ldr
Network-based loader and flasher for Pano G2 devices
☆15 · Jul 8, 2023 · Updated 3 years ago
bryancatanzaro / trove
Full-speed Array of Structures access
☆177 · Apr 25, 2023 · Updated 3 years ago
dawsonjon / chips_v
RISC-V System on Chip Builder
☆12 · Sep 27, 2020 · Updated 5 years ago
AWB-Tools / awb
Architect's workbench
☆11 · May 5, 2016 · Updated 10 years ago
papers-submission / structured_transposable_masks
Code for ICML 2021 submission
☆35 · Mar 24, 2021 · Updated 5 years ago
UoB-HPC / minifmm
☆11 · Aug 8, 2021 · Updated 4 years ago
rvgpu / rvgpu
☆20 · Nov 4, 2024 · Updated last year
meton-robean / deca
RocketChip RoCC Accelerator template (RISC-V, Chisel) (accelerator development project framework)
☆15 · Sep 5, 2019 · Updated 6 years ago
vortexgpgpu / Volt
☆17 · Feb 9, 2026 · Updated 5 months ago
google / nvidia_libs_test
Tests and benchmarks for cudnn (and in the future, other nvidia libraries)
☆55 · Nov 20, 2020 · Updated 5 years ago
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 · May 12, 2024 · Updated 2 years ago
mcrl / tccl
Thunder Research Group's Collective Communication Library
☆53 · Jul 8, 2025 · Updated last year
owensgroup / GpuBTree
Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019
☆57 · Jun 27, 2022 · Updated 4 years ago