bacaldwell / scalable-monitoring
Scripts for monitoring InfiniBand and storage devices
☆12 Updated 10 years ago
Alternatives and similar repositories for scalable-monitoring
Users who are interested in scalable-monitoring are comparing it to the repositories listed below.
- InfiniBand fabric monitoring daemon written in Go ☆32 Updated 4 months ago
- Prometheus exporter for an Infiniband Fabric ☆67 Updated last year
- OGRT Runtime Tracker ☆11 Updated 5 years ago
- Lustre administration tool ☆24 Updated 3 months ago
- HPC tests using MPI codes & synthetic benchmarks with IB/RoCE comparisons - from StackHPC Ltd. ☆21 Updated 3 years ago
- Lustre Monitoring System ☆25 Updated 7 months ago
- Lustre Monitoring Tools ☆77 Updated last week
- ☆19 Updated 4 years ago
- HPCPerfStats (formerly TACC Stats) is an automated resource-usage monitoring and analysis package for HPC Clusters. ☆49 Updated last week
- A terminal based monitoring tool for InfiniBand networks using Detector (https://github.com/hhu-bsinfo/detector) ☆14 Updated 6 years ago
- Lustre Monitoring System based on Collectd, Grafana and Influxdb ☆45 Updated last year
- Prometheus exporter for use with the Lustre parallel filesystem ☆41 Updated 3 years ago
- Prometheus exporter for use with the Lustre parallel filesystem ☆28 Updated this week
- Custom Slurm tools ☆25 Updated 6 years ago
- Slurm Lua SPANK plugin ☆16 Updated 8 months ago
- Command-line tool to retrieve information and monitor Mellanox un-managed Infiniband switches ☆67 Updated 6 months ago
- A Slurm-based HPC workload management environment, driven by Ansible. ☆65 Updated last week
- stable lustre sources ☆26 Updated 5 years ago
- Some lustre-related scripts and utilities in use at LLNL. ☆26 Updated 6 months ago
- Ansible role for OpenHPC ☆50 Updated last week
- File utilities designed for scalability and performance. ☆187 Updated 2 months ago
- Converts an Infiniband topology file to graphviz dot format or slurm topology.conf format ☆17 Updated 8 months ago
- Prometheus exporter for the stats in the cgroup accounting with slurm. This will also collect stats of a job using NVIDIA GPUs. ☆36 Updated last month
- ☆28 Updated last year
- This web portal is intended to give HPC users a view of the overall use of the HPC cluster and their own use. ☆36 Updated last month
- Grand Unified File-Index ☆55 Updated last week
- HPC dashboards developed for SRCC systems ☆19 Updated 3 years ago
- Pavilion is a Python 3 (3.5+) based framework for running and analyzing tests targeting HPC systems. ☆46 Updated last month
- Tools for MPI programmers ☆14 Updated 5 years ago
- SLURM Tools and UBiLities ☆73 Updated 3 years ago