NVIDIA / Bobber
Containerized testing of system components that impact AI workload performance
☆15 · Updated 2 years ago
Alternatives and similar repositories for Bobber
Users who are interested in Bobber are comparing it to the repositories listed below.
- Python bindings for UCX ☆139 · Updated 4 months ago
- Deep Learning Benchmarking Suite ☆130 · Updated 3 years ago
- MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible. ☆158 · Updated 2 months ago
- RAPIDS GPU-BDB ☆108 · Updated last year
- This repository contains the results and code for the MLPerf™ Training v0.7 benchmark. ☆57 · Updated 2 years ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆203 · Updated last week
- TensorFlow-nGraph bridge ☆136 · Updated 4 years ago
- A top-like tool for monitoring GPUs in a cluster ☆84 · Updated last year
- Tools to deploy GPU clusters in the Cloud ☆34 · Updated 2 years ago
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆35 · Updated 4 months ago
- MLFlow Deployment Plugin for Ray Serve ☆46 · Updated 3 years ago
- Scheduling GPU cluster workloads with Slurm ☆78 · Updated 7 years ago
- Enhanced networking support for TensorFlow. Maintained by SIG-networking. ☆98 · Updated 4 years ago
- CloudAI Benchmark Framework ☆82 · Updated this week
- NGC Container Replicator ☆28 · Updated 3 years ago
- Inference Model Manager for Kubernetes ☆46 · Updated 6 years ago
- Incubating project for xgboost operator ☆77 · Updated 4 years ago
- oneCCL Bindings for Pytorch* (deprecated) ☆104 · Updated last month
- Benchmarks to capture important workloads. ☆32 · Updated 2 weeks ago
- Computation using data flow graphs for scalable machine learning ☆68 · Updated this week
- The Triton backend for the PyTorch TorchScript models. ☆173 · Updated this week
- GraphDef Editor: A port of the TensorFlow contrib.graph_editor package that operates over serialized graphs ☆31 · Updated 3 years ago
- 3rd party dependencies for DALI project ☆11 · Updated this week
- Issues related to MLPerf® Inference policies, including rules and suggested changes ☆63 · Updated this week
- Python bindings for NVTX ☆67 · Updated 2 years ago
- ☆384 · Updated last year
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆63 · Updated last month
- A multi-user, distributed computing environment for running DL model training experiments on Intel® Xeon® Scalable processor-based system… ☆393 · Updated last year
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆17 · Updated 3 years ago
- A tensor-aware point-to-point communication primitive for machine learning ☆283 · Updated last month