A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15Jun 21, 2019Updated 6 years ago
Alternatives and similar repositories for klap
Users who are interested in klap are comparing it to the libraries listed below
- Evaluating different memory managers for dynamic GPU memory☆26Dec 16, 2020Updated 5 years ago
- ☆11Aug 4, 2022Updated 3 years ago
- ☆28Aug 14, 2024Updated last year
- Horizontal Fusion☆24Jan 7, 2022Updated 4 years ago
- ☆11Jun 29, 2021Updated 4 years ago
- This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa…☆13Apr 4, 2024Updated last year
- ☆18Mar 4, 2025Updated last year
- Library that simplifies host-GPU data transfer using userspace page-fault handling☆15Jun 8, 2012Updated 13 years ago
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆50Jul 23, 2024Updated last year
- An Attention Superoptimizer☆22Jan 20, 2025Updated last year
- A synthesis flow for hybrid processing-in-RRAM modes☆12Jul 15, 2021Updated 4 years ago
- ☆14Mar 4, 2015Updated 11 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.☆14Nov 23, 2024Updated last year
- PTX-EMU is a simple emulator for CUDA programs.☆38Apr 25, 2025Updated 10 months ago
- GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving☆20Jul 30, 2025Updated 7 months ago
- ☆18Apr 21, 2024Updated last year
- Code for a privacy-preserving SAT solver☆17Jul 14, 2023Updated 2 years ago
- GKLEE is a symbolic analyser and test generator tailored for CUDA C++ programs☆16Dec 12, 2014Updated 11 years ago
- Memory consistency model checking and test generation library.☆16Oct 14, 2016Updated 9 years ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19Feb 24, 2026Updated last week
- Cross platform Instant Outbidding Bot, Instant Outbidder Bot is designed to outbid all real-time bids within a second by percentage incre…☆100Jan 17, 2023Updated 3 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"☆41Nov 16, 2021Updated 4 years ago
- ☆36Jun 10, 2024Updated last year
- A repository that complements gpgpu-sim, providing automated regression scripts, simulation launching utilities and the code + arguments …☆75Aug 22, 2020Updated 5 years ago
- ☆20May 30, 2024Updated last year
- ☆38Jun 27, 2025Updated 8 months ago
- ☆20Mar 1, 2021Updated 5 years ago
- ☆20Sep 28, 2024Updated last year
- Code for ACL2022 publication Transkimmer: Transformer Learns to Layer-wise Skim☆22Aug 21, 2022Updated 3 years ago
- Performance Prediction Toolkit☆56Sep 13, 2025Updated 5 months ago
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.☆43May 29, 2022Updated 3 years ago
- GVProf: A Value Profiler for GPU-based Clusters☆53Mar 24, 2024Updated last year
- DIVINE model checker git mirror, https://divine.fi.muni.cz. This is a read-only mirror of the main darcs repository. Issues should be rep…☆22Mar 21, 2021Updated 4 years ago
- RTLCheck☆25Oct 9, 2018Updated 7 years ago
- Process Orchestration Framework: A camunda 7 fork☆21Updated this week
- SMT solver for the theory of floating-point arithmetic☆25Jan 30, 2018Updated 8 years ago
- [USENIX ATC 2021] Exploring the Design Space of Page Management for Multi-Tiered Memory Systems☆48Mar 31, 2022Updated 3 years ago
- ngAP's artifact for ASPLOS'24☆25Jul 29, 2025Updated 7 months ago
- Open source release from our ICLR 2020 paper, CLN2INV: Learning Loop Invariants with Continuous Logic Networks.☆21Jun 8, 2020Updated 5 years ago