Sichek is a tool for detecting and diagnosing node-level issues in AI environments, ensuring the reliability and high performance of GPU-intensive workloads. It proactively identifies hardware and software problems and triggers timely automated corrective actions, including task retries and operational maintenance.
☆25May 13, 2026Updated last week
Alternatives and similar repositories for sichek
Users who are interested in sichek are comparing it to the libraries listed below.
- Arks is a cloud-native inference framework running on Kubernetes☆47Jan 14, 2026Updated 4 months ago
- ☆14Feb 14, 2025Updated last year
- IFCB data system, generation 2☆10Apr 13, 2026Updated last month
- Bulk import of monitored hosts into Zabbix☆10Feb 2, 2015Updated 11 years ago
- Orchestrating many small GPU clusters for running serverless GPU workloads☆17Mar 15, 2026Updated 2 months ago
- Analyzes generic firewall rules and detects conflicts and anomalies.☆17Jan 25, 2025Updated last year
- Validating Network Deployments with NAPALM☆11Feb 1, 2018Updated 8 years ago
- This is an example script that leverages pyATS to look up the switch interfaces where MAC addresses are located.☆12Jan 4, 2021Updated 5 years ago
- An opinionated open source deployment of JupyterHub based on a Slurm job scheduler.☆30Sep 30, 2024Updated last year
- Files and materials for the "Hands-On Practical Network Automation" workshop at Interop ITX 2017 in Las Vegas, NV☆14Feb 17, 2021Updated 5 years ago
- A set of programs to download, upload, convert, analyze and create a policy for FortiGate firewalls☆15Mar 6, 2025Updated last year
- Jinja2-based configuration generator with some extensions required to generate configurations for network devices. It's built on top of …☆19Oct 11, 2017Updated 8 years ago
- ☆16Jun 17, 2021Updated 4 years ago
- Operates H3C switches over the NETCONF protocol, supporting functions such as adding and deleting static route entries☆13May 26, 2017Updated 8 years ago
- COCCL: Compression and precision co-aware collective communication library☆31Mar 16, 2025Updated last year
- Abstract all the things☆14Jun 18, 2016Updated 9 years ago
- Netbox_joined_inventory is a python script that gathers data from a Netbox source-of-truth and stores them as Ansible inventory, group_va…☆22Jul 29, 2020Updated 5 years ago
- Kafka extension for Nameko framework☆18Jul 12, 2023Updated 2 years ago
- Automatically generates switch configurations in bulk from configuration templates☆15Mar 1, 2018Updated 8 years ago
- Stubbing out and documenting FastAPI, VueJS 3 and Docker workflow.☆19Jan 11, 2021Updated 5 years ago
- Chinese-whisper clustering algorithm (only the documentation is shown, because the code is protected by the company)☆12Apr 18, 2018Updated 8 years ago
- ☆76Oct 25, 2025Updated 6 months ago
- Import Interfaces and IP Address into Netbox☆17Sep 26, 2019Updated 6 years ago
- Python bindings for Augeas☆44Oct 26, 2023Updated 2 years ago
- Mellanox userland tools and scripts☆144May 10, 2026Updated last week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node☆67Updated this week
- Custom scripts for NetBox (DCIM/IPAM)☆24Oct 29, 2018Updated 7 years ago
- Benchmarking PyTorch 2.0 different models☆20Mar 19, 2023Updated 3 years ago
- SOTA benchmark☆18Aug 8, 2023Updated 2 years ago
- etcd v3 client with hierarchy☆28Mar 6, 2022Updated 4 years ago
- Simple Go 1.8 plugin test for https://jeremywho.com/go-1.8---plugins/☆10Feb 28, 2017Updated 9 years ago
- A KV store combined with AEP☆14Jan 18, 2021Updated 5 years ago
- Notes and course material for MATH50003 Numerical Analysis (2021–2022)☆49Mar 12, 2024Updated 2 years ago
- IPChain Core Wallet☆24Aug 5, 2019Updated 6 years ago
- Audit the ACLs of your network devices☆33Aug 5, 2020Updated 5 years ago
- (Deprecated) Project content has been migrated to:☆14Mar 6, 2019Updated 7 years ago
- disk usage for IBM Storage Scale file systems☆12Apr 20, 2026Updated last month
- eGO - Enlightening Golang☆15Jan 29, 2018Updated 8 years ago
- This is the Microsoft Azure data center network latency monitoring and visualization system. It is composed of three: agent & con…☆28Jun 4, 2018Updated 7 years ago