GPU Cluster Monitoring (GCM): Large-Scale AI Research Cluster Monitoring
☆221 · Apr 14, 2026 · Updated this week
Alternatives and similar repositories for gcm
Users that are interested in gcm are comparing it to the libraries listed below.
- Backend for skillgraph - a skill based framework for building agents that work. ☆32 · Nov 10, 2025 · Updated 5 months ago
- Prometheus collector and exporter for Slurm cluster metrics. A Slinky project. ☆16 · Nov 7, 2025 · Updated 5 months ago
- ☆26 · Apr 12, 2026 · Updated last week
- Whisper finetuning ☆16 · Apr 9, 2025 · Updated last year
- Python client for the Run:ai REST API ☆24 · Dec 15, 2025 · Updated 4 months ago
- ☆18 · Jun 10, 2024 · Updated last year
- Python Bash emulation for agents, a port of vercel-labs/just-bash ☆40 · Feb 19, 2026 · Updated 2 months ago
- JFC! What a hot mess. *Scream into void* ☆13 · Sep 20, 2021 · Updated 4 years ago
- Rust on ESP32 "Hello World" app. A demo binary crate for the ESP32[XX] and ESP-IDF, which connects to WiFi, drives a small HTTP server an… ☆10 · Oct 1, 2021 · Updated 4 years ago
- ☆13 · Apr 7, 2026 · Updated last week
- ☆14 · Feb 14, 2025 · Updated last year
- On-device AI SDK for Flutter — LLM inference, vision, STT, TTS, image generation, embeddings, RAG, and function calling. Metal GPU on iOS… ☆80 · Mar 17, 2026 · Updated last month
- PrintQueue: Performance Diagnosis via Queue Measurement in the Data Plane ☆19 · Jun 23, 2023 · Updated 2 years ago
- A graduate course on distributed systems ☆17 · Jul 17, 2022 · Updated 3 years ago
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆283 · Updated this week
- Sentiment Analysis implemented using Gluon and MXNet ☆11 · May 12, 2018 · Updated 7 years ago
- Mantis: Reactive Programmable Switches (SIGCOMM 2020) ☆22 · Aug 21, 2020 · Updated 5 years ago
- Enabling High Quality Real-Time Communications with Adaptive Frame-Rate (USENIX NSDI 2023) ☆23 · Jan 5, 2024 · Updated 2 years ago
- Deploying EFA in EKS utilizing GPUDirectRDMA where supported ☆36 · Oct 15, 2024 · Updated last year
- This repository provides a comprehensive benchmark for evaluating the performance of neural watermarking techniques. The benchmark includ… ☆26 · Jan 9, 2026 · Updated 3 months ago
- CUPTI based GPU profiling library exposing usdt hooks ☆29 · Updated this week
- An agent for CUDA compute-communication kernel co-design ☆34 · Mar 24, 2026 · Updated 3 weeks ago
- [Deprecated and unmaintained] Uses boto to retrieve current spot instance prices on Amazon EC2. ☆19 · May 28, 2019 · Updated 6 years ago
- Proposed plumbing commands for cargo ☆22 · Updated this week
- NNVM for ROCm Examples ☆19 · Nov 22, 2017 · Updated 8 years ago
- Code implementation for the paper "Large-scale Pre-training for Grounded Video Caption Generation" (ICCV 2025) ☆30 · Jan 18, 2026 · Updated 3 months ago
- ☆18 · Oct 8, 2018 · Updated 7 years ago
- Deploy your own private OpenAI-compatible LLM ☆27 · Jun 5, 2025 · Updated 10 months ago
- MXNet implementation of Graph Convolutional Neural Networks ☆20 · Oct 8, 2018 · Updated 7 years ago
- Official Implementation of Knowledge Flow Prompting ☆35 · Oct 20, 2025 · Updated 5 months ago
- Developing a legal research tool leveraging ChatGPT / GPT-4 ☆14 · Mar 10, 2024 · Updated 2 years ago
- how to build a sentence embedding application using BentoML ☆14 · Mar 31, 2025 · Updated last year
- ☆16 · May 2, 2023 · Updated 2 years ago
- ☆68 · Mar 21, 2025 · Updated last year
- ☆25 · May 23, 2025 · Updated 10 months ago
- API serving for your diffusers models ☆11 · Jan 19, 2024 · Updated 2 years ago
- Implementation of Capsule Net from the paper Dynamic Routing Between Capsules ☆24 · Nov 12, 2017 · Updated 8 years ago
- ☆23 · Jun 26, 2024 · Updated last year
- An AI-powered GitHub search tool utilising Generative UI ☆14 · Jul 20, 2024 · Updated last year