NascentCore / 3k
3-k platform is for training LLMs
☆14 Updated last month
Alternatives and similar repositories for 3k
Users that are interested in 3k are comparing it to the libraries listed below
- InfiniBand SR-IOV CNI ☆13 Updated last week
- ☆63 Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆92 Updated this week
- Prometheus exporter for an InfiniBand fabric ☆59 Updated last year
- ☆58 Updated 4 years ago
- ☆28 Updated last year
- RDMA CNI plugin for containerized workloads ☆52 Updated 3 weeks ago
- Intelligent platform for AI workloads ☆37 Updated 2 years ago
- m3fs (Make 3FS) is a toolset designed to deploy a 3FS cluster. ☆36 Updated 2 weeks ago
- Health checks for Azure N- and H-series VMs. ☆42 Updated last month
- The BeeGFS Container Storage Interface (CSI) driver provides high performing and scalable storage for workloads running in Kubernetes. 📦… ☆68 Updated last month
- Hooked CUDA-related dynamic libraries by using automated code generation tools. ☆156 Updated last year
- ☆20 Updated 7 months ago
- ☆62 Updated 4 months ago
- ☆62 Updated last week
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆81 Updated 2 weeks ago
- An HPC and Cloud Computing Fused Job Scheduling System ☆101 Updated this week
- Device plugin for Volcano vGPU that supports hard resource isolation ☆79 Updated last week
- GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23) ☆34 Updated last year
- A distributed engine for intelligent workload ☆26 Updated 3 months ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod ☆125 Updated 3 years ago
- An efficient GPU resource sharing system with fine-grained control for Linux platforms. ☆83 Updated last year
- ☆42 Updated last year
- Magnum IO community repo ☆95 Updated 3 weeks ago
- This repository provides installation scripts and configuration files for deploying the CSGHub instance, includes Helm charts and Docker… ☆16 Updated last week
- A novel temporal fusion framework for propelling autoregressive model inference ☆11 Updated this week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆19 Updated last week
- The API (CRD) of Volcano ☆42 Updated this week
- Ubuntu kernels which are optimized for NVIDIA server systems ☆38 Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆36 Updated 2 weeks ago