lopentusska / slurm_ubuntu_gpu_cluster
Instructions for setting up a Slurm GPU cluster on Ubuntu 22.04.
☆15 · Updated 10 months ago
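For context, a GPU-enabled Slurm setup like the one this repository documents typically declares GPUs as a generic resource (GRES) in the cluster configuration. A minimal sketch is shown below; the hostname, GPU count, CPU count, and memory values are illustrative assumptions, not taken from the repository:

```
# /etc/slurm/slurm.conf (fragment) — enable GPU scheduling via GRES
GresTypes=gpu
NodeName=gpu-node01 Gres=gpu:2 CPUs=32 RealMemory=128000 State=UNKNOWN
PartitionName=gpu Nodes=gpu-node01 Default=YES MaxTime=INFINITE State=UP

# /etc/slurm/gres.conf on gpu-node01 — map GRES entries to device files
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
```

Jobs can then request GPUs with, e.g., `sbatch --gres=gpu:1 job.sh`.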
Alternatives and similar repositories for slurm_ubuntu_gpu_cluster
Users interested in slurm_ubuntu_gpu_cluster are comparing it to the libraries listed below:
- Instructions for setting up a SLURM cluster using Ubuntu 18.04.3 with GPUs. ☆140 · Updated 4 years ago
- A Python library that transfers PyTorch tensors between CPU and NVMe. ☆102 · Updated last month
- Elixir: Train a Large Language Model on a Small GPU Cluster. ☆13 · Updated last year
- ☆57 · Updated 7 months ago
- CloudAI Benchmark Framework. ☆47 · Updated this week
- LLM-Inference-Bench. ☆26 · Updated last week
- A parallel framework for training deep neural networks. ☆49 · Updated this week
- Example ML projects that use the Determined library. ☆25 · Updated 4 months ago
- A minimal implementation of vLLM. ☆32 · Updated 5 months ago
- MLPerf™ logging library. ☆32 · Updated last week
- ☆62 · Updated last month
- NGC Container Replicator. ☆28 · Updated 2 years ago
- Container plugin for the Slurm Workload Manager. ☆311 · Updated 2 months ago
- Intel Gaudi's Megatron-DeepSpeed large language models for training. ☆13 · Updated last month
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference. ☆62 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs. ☆11 · Updated this week
- PyTorch library for cost-effective, fast, and easy serving of MoE models. ☆112 · Updated last month
- Breaking the Throughput-Latency Trade-off for Long Sequences with Speculative Decoding. ☆107 · Updated last month
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers. ☆204 · Updated 4 months ago
- Distributed IO-aware attention algorithm. ☆18 · Updated 4 months ago
- pytorch-profiler. ☆50 · Updated last year
- ☆114 · Updated 10 months ago
- No-GIL Python environment featuring NVIDIA deep learning libraries. ☆39 · Updated 2 months ago
- Modular and structured prompt caching for low-latency LLM inference. ☆83 · Updated 2 months ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining. ☆12 · Updated last year
- The CUDA target for Numba. ☆41 · Updated last week
- ☆25 · Updated last year
- A safetensors extension to efficiently store sparse quantized tensors on disk. ☆64 · Updated this week
- Performance benchmarking with ColossalAI. ☆39 · Updated 2 years ago
- CUDA 12.2 HMM demos. ☆19 · Updated 5 months ago