Light-weight Performance Variance Detection for Production-run Parallel Applications
☆16 · Aug 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for VAPRO
Users that are interested in VAPRO are comparing it to the libraries listed below.
- A GPU FP32 computation method with Tensor Cores. ☆26 · Dec 8, 2025 · Updated 3 months ago
- C-Coupler2: a flexible and user-friendly community coupler for model coupling and nesting ☆39 · Sep 4, 2019 · Updated 6 years ago
- A portable and efficient infrastructure for value profilers. Doc: https://vclinic.readthedocs.io/en/latest/index.html ☆14 · Mar 4, 2026 · Updated 2 weeks ago
- ☆15 · Dec 26, 2022 · Updated 3 years ago
- Anticipating Invariant ☆12 · Mar 14, 2014 · Updated 12 years ago
- tensorflow fork with Salus integration ☆12 · Jan 7, 2022 · Updated 4 years ago
- Fortran IO Netcdf Assembly ☆19 · Sep 12, 2021 · Updated 4 years ago
- Official implementation of Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores. ☆14 · Nov 13, 2025 · Updated 4 months ago
- This is the open-source site for XFDetector (ASPLOS'20) ☆11 · Mar 5, 2021 · Updated 5 years ago
- Domain-specific framework for performance analysis of parallel programs ☆16 · Feb 11, 2026 · Updated last month
- Sequence-level 1F1B schedule for LLMs. ☆19 · Jun 4, 2024 · Updated last year
- C ABCI libraries ☆14 · Sep 10, 2017 · Updated 8 years ago
- Sample code and application to simplify onboarding new hosts to the network with DNA Center ☆14 · Dec 8, 2022 · Updated 3 years ago
- DragonEgg has been migrated to GCC 8 and LLVM 6 but can also work with GCC 4.8 and LLVM 3.3 ☆20 · Apr 29, 2019 · Updated 6 years ago
- Test cases for MIPS CPU implementation ☆12 · Dec 26, 2019 · Updated 6 years ago
- NVIDIA DPU OPs collection ☆15 · Mar 6, 2023 · Updated 3 years ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer ☆170 · Feb 11, 2026 · Updated last month
- ☆13 · Jan 23, 2021 · Updated 5 years ago
- Create beegfs server and client ☆24 · Dec 2, 2021 · Updated 4 years ago
- Sampled simulation of multi-threaded applications using LoopPoint methodology ☆24 · Feb 21, 2026 · Updated last month
- ☆23 · Mar 31, 2012 · Updated 13 years ago
- Tools and library to manipulate EFI variables. ☆10 · Mar 17, 2026 · Updated last week
- Einsum optimization using opt_einsum and PyTorch FX graph rewriting ☆22 · Mar 17, 2022 · Updated 4 years ago
- Extending the HDF5 library to support intelligent I/O buffering for deep memory and storage hierarchy systems ☆34 · Feb 17, 2025 · Updated last year
- ☆24 · Nov 27, 2025 · Updated 3 months ago
- Open source version of DOCA GPUNetIO and DOCA Verbs libraries (limited features) to enable GDAKI technology on RDMA (IB and RoCE) ☆34 · Updated this week
- 📝 "Synthesizing Benchmarks for Predictive Modeling" (🥇 CGO'17 Best Paper) ☆22 · Feb 10, 2023 · Updated 3 years ago
- ☆22 · Jan 10, 2023 · Updated 3 years ago
- [READ ONLY] Refer to gitlab repo for updated version - Total Knowledge of I/O Reference Implementation. Please see wiki for contribution… ☆22 · May 18, 2022 · Updated 3 years ago
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/ ☆25 · Jun 14, 2019 · Updated 6 years ago
- ProtoText is an efficient Python library, offering dict-like operations and text format serialization for Google protobuf objects. ☆20 · Jan 15, 2020 · Updated 6 years ago
- Linux kernel driver to export the TSC frequency via sysfs ☆54 · Sep 24, 2019 · Updated 6 years ago
- Used for testing the metadata performance of a file system ☆25 · Nov 29, 2017 · Updated 8 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆125 · Jun 23, 2022 · Updated 3 years ago
- ☆41 · Jun 5, 2024 · Updated last year
- verbs profiling library ☆22 · Sep 22, 2023 · Updated 2 years ago
- Rust LLVM Practises ☆17 · Dec 29, 2020 · Updated 5 years ago
- Blog program using the framework from liaoxuefeng. ☆23 · Aug 1, 2015 · Updated 10 years ago
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning" ☆25 · Feb 24, 2023 · Updated 3 years ago