☆74 · Jun 29, 2023 · Updated 3 years ago
Alternatives and similar repositories for nvvmir-samples
Users who are interested in nvvmir-samples are comparing it to the libraries listed below.
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆124 · Apr 18, 2025 · Updated last year
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels ☆14 · Aug 26, 2015 · Updated 10 years ago
- cuASR: CUDA Algebra for Semirings ☆49 · Aug 22, 2022 · Updated 3 years ago
- LLVM Plugin to Instrument Global Memory Accesses in CUDA Kernels ☆10 · Jun 8, 2020 · Updated 6 years ago
- ☆67 · Oct 10, 2024 · Updated last year
- D bindings and wrapper library for the MXNet deep learning library ☆14 · Sep 11, 2019 · Updated 6 years ago
- CUPTI GPU Profiler ☆39 · Feb 26, 2019 · Updated 7 years ago
- Project ARES represents a joint effort between LANL and ORNL to introduce a common compiler representation and tool-chain for HPC applica… ☆10 · Nov 30, 2016 · Updated 9 years ago
- Outline and links for PLDI 2022 tutorial ☆17 · Jun 13, 2022 · Updated 4 years ago
- ☆84 · Nov 16, 2020 · Updated 5 years ago
- GLSL code generator to aid use of Vulkan's descriptor set indexing ☆14 · Apr 20, 2019 · Updated 7 years ago
- Multiple 1-stencil implementations using NVIDIA CUDA. ☆12 · Dec 2, 2017 · Updated 8 years ago
- ☆19 · Nov 21, 2022 · Updated 3 years ago
- Some RL algorithms ☆19 · Dec 9, 2016 · Updated 9 years ago
- ngAP's artifact for ASPLOS'24 ☆25 · Jul 29, 2025 · Updated 11 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆33 · Apr 2, 2025 · Updated last year
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Jun 14, 2023 · Updated 3 years ago
- Colby Hall's C++ Standard Library ☆11 · Jan 13, 2020 · Updated 6 years ago
- GPUOCelot: A dynamic compilation framework for PTX ☆287 · Jul 31, 2023 · Updated 2 years ago
- Sample programs for the LLVM PTX back-end ☆41 · Aug 27, 2015 · Updated 10 years ago
- Multi-GPU dynamic scheduler using PGAS style cross-GPU communication ☆29 · Jul 23, 2023 · Updated 2 years ago
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator. ☆45 · Oct 25, 2021 · Updated 4 years ago
- Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support… ☆50 · Aug 21, 2018 · Updated 7 years ago
- The SHOC Benchmark Suite ☆262 · Oct 6, 2025 · Updated 8 months ago
- A numerical optimisation and deep learning framework for D. ☆29 · Jul 31, 2018 · Updated 7 years ago
- GPGPU-SIM usage guide ☆14 · Nov 12, 2022 · Updated 3 years ago
- ☆17 · Oct 15, 2023 · Updated 2 years ago
- ☆20 · Feb 21, 2022 · Updated 4 years ago
- Cross-platform work with serial ports ☆22 · Nov 21, 2022 · Updated 3 years ago
- Fast Point Overlap Test ☆19 · Jun 17, 2018 · Updated 8 years ago
- GPGPU-Sim provides a detailed simulation model of a contemporary GPU running CUDA and/or OpenCL workloads and now includes an integrated… ☆15 · Jun 24, 2020 · Updated 6 years ago
- Adobe's C++ Performance Benchmarks for modern compilers (and build systems) ☆12 · Aug 3, 2019 · Updated 6 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆33 · Mar 15, 2021 · Updated 5 years ago
- A decentralized unique ID generator (int64) ☆22 · Jun 15, 2016 · Updated 10 years ago
- Julia ports of the Rodinia benchmark suite for heterogeneous computing infrastructures ☆56 · Aug 15, 2023 · Updated 2 years ago
- ☆27 · Mar 26, 2025 · Updated last year
- LHCSim is a 3D physics simulation engine developed based on taichi ☆17 · Jul 20, 2022 · Updated 3 years ago
- A GPU FP32 computation method with Tensor Cores. ☆27 · Dec 8, 2025 · Updated 6 months ago
- Material and work for O'Reilly courses and publications ☆11 · May 19, 2020 · Updated 6 years ago