discovery-unicamp / dasf-core
Framework for computing Machine Learning algorithms in Python using Dask and RAPIDS AI.
☆12Updated last week
Alternatives and similar repositories for dasf-core
Users that are interested in dasf-core are comparing it to the libraries listed below
- Repository for MLCommons Chakra schema and tools☆153Updated 3 months ago
- ☆198Updated 6 years ago
- MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters☆20Updated 2 years ago
- Tools to help with LaTeX paper writing☆17Updated 2 years ago
- 🔮 Execution time predictions for deep neural network training iterations across different GPUs.☆63Updated 3 years ago
- Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020☆137Updated last year
- Measure and optimize the energy consumption of your AI applications!☆332Updated last week
- LIBRA: Enabling Workload-aware Multi-dimensional Network Topology Optimization for Distributed Training of Large AI Models☆12Updated last year
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches☆80Updated 2 years ago
- ☆216Updated 2 months ago
- Magnum IO community repo☆109Updated 2 months ago
- Multi-GPU communication profiler and visualizer☆37Updated last year
- Multi-Instance-GPU profiling tool☆58Updated 2 years ago
- Microsoft Collective Communication Library☆381Updated 2 years ago
- Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.☆55Updated 3 years ago
- An interference-aware scheduler for fine-grained GPU sharing☆159Updated 2 months ago
- ACT: An Architectural Carbon Modeling Tool for Designing Sustainable Computer Systems☆45Updated 6 months ago
- RDMA and SHARP plugins for nccl library☆223Updated 3 weeks ago
- NCCL Profiling Kit☆152Updated last year
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆144Updated this week
- Helios Traces from SenseTime☆61Updated 3 years ago
- Unified Collective Communication Library☆291Updated last week
- Artifacts for our NSDI'23 paper TGS☆96Updated last year
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆27Updated 3 years ago
- ☆304Updated this week
- A tool to detect infrastructure issues on cloud native AI systems☆52Updated 4 months ago
- A hierarchical collective communications library with portable optimizations☆37Updated last year
- ☆24Updated 2 years ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.☆204Updated this week
- Example code for using DC QP to provide RDMA READ and WRITE operations to remote GPU memory☆152Updated last year