NascentCore / 3k
The 3-k platform is for training LLMs
☆14Updated 3 weeks ago
Alternatives and similar repositories for 3k:
Users who are interested in 3k are comparing it to the libraries listed below
- Prometheus exporter for an InfiniBand fabric☆59Updated last year
- NVIDIA NCCL Tests for Distributed Training☆88Updated this week
- InfiniBand SR-IOV CNI☆13Updated last week
- GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPUs by running a BLAS matrix multiply using different data types…☆88Updated last week
- ☆62Updated last week
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management.☆24Updated this week
- Intelligent platform for AI workloads☆37Updated 2 years ago
- RDMA CNI plugin for containerized workloads☆52Updated last week
- InfiniBand SR-IOV CNI☆47Updated last week
- ☆42Updated 11 months ago
- ☆49Updated this week
- This repository provides installation scripts and configuration files for deploying the CSGHub instance, including Helm charts and Docker…☆14Updated last week
- ☆28Updated 2 months ago
- A tool to detect infrastructure issues on cloud native AI systems☆31Updated last month
- Transparent checkpoint/restart library for CUDA applications.☆12Updated 10 years ago
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver…☆236Updated last week
- Golang bindings for NVIDIA Data Center GPU Manager (DCGM)☆109Updated 3 weeks ago
- ☆48Updated 7 months ago
- Tools for monitoring NVIDIA GPUs on Linux☆9Updated 5 years ago
- ☆11Updated this week
- Testing if I can implement Slurm in an operator☆14Updated 5 months ago
- Fast and efficient attention method exploration and implementation.☆21Updated last month
- Device plugin for Volcano vGPU that supports hard resource isolation☆73Updated this week
- Ubuntu kernels which are optimized for NVIDIA server systems☆37Updated this week
- ☆58Updated 4 years ago
- Resource exporter for Volcano scheduling, e.g. NUMA-aware scheduling.☆17Updated 6 months ago
- Bitfusion with Kubernetes Integration Support☆50Updated last year
- A model serving framework for various research and production scenarios. Seamlessly built upon the PyTorch and HuggingFace ecosystem.☆23Updated 6 months ago
- Magnum IO community repo☆89Updated 3 months ago
- ☆30Updated this week