NVIDIA / Bobber
Containerized testing of system components that impact AI workload performance
☆14 · Updated last year
Related projects
Alternatives and complementary repositories for Bobber
- Imageinary is a reproducible mechanism which is used to generate large image datasets at various resolutions. The tool supports multiple … ☆26 · Updated last year
- Test data for DALI project ☆40 · Updated 3 weeks ago
- NGC Container Replicator ☆28 · Updated last year
- 3rd party dependencies for DALI project ☆10 · Updated last week
- Python bindings for UCX ☆121 · Updated this week
- CloudAI Benchmark Framework ☆38 · Updated this week
- Tools to deploy GPU clusters in the Cloud ☆30 · Updated last year
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆22 · Updated 3 weeks ago
- RAPIDS GPU-BDB ☆107 · Updated 8 months ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆147 · Updated this week
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 · Updated last year
- Run cloud native workloads on NVIDIA GPUs ☆134 · Updated this week
- pytorch ucc plugin ☆17 · Updated 3 years ago
- MLPerf™ logging library ☆30 · Updated this week
- OpenVINO backend for Triton. ☆30 · Updated this week
- Computation using data flow graphs for scalable machine learning ☆67 · Updated this week
- Scheduling GPU cluster workloads with Slurm ☆74 · Updated 6 years ago
- ☆19 · Updated this week
- Scoreboard for ONNX Backend Compatibility ☆27 · Updated this week
- Singularity implementation of k8s operator for interacting with SLURM. ☆117 · Updated 3 years ago
- General policies for MLPerf™ including submission rules, coding standards, etc. ☆28 · Updated this week
- A top-like tool for monitoring GPUs in a cluster ☆81 · Updated 9 months ago
- ☆35 · Updated last year
- Optimized primitives for collective multi-GPU communication ☆21 · Updated 7 months ago
- benchmarking some transformer deployments ☆26 · Updated last year
- oneCCL Bindings for Pytorch* ☆86 · Updated 3 weeks ago
- FIL backend for the Triton Inference Server ☆72 · Updated this week
- Inference Model Manager for Kubernetes ☆46 · Updated 5 years ago
- ☆311 · Updated 6 months ago