amd / ZenDNN-tensorflow-plugin
☆8 Updated last month
Alternatives and similar repositories for ZenDNN-tensorflow-plugin
Users who are interested in ZenDNN-tensorflow-plugin compare it to the libraries listed below.
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs) ☆39 Updated this week
- Magnum IO community repo ☆91 Updated 3 months ago
- A CUTLASS implementation using SYCL ☆21 Updated this week
- ☆20 Updated last month
- Tenstorrent Firmware Update Utility ☆13 Updated this week
- ROCm Documentation Python package for ReadTheDocs build standardization ☆16 Updated this week
- Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks. ☆17 Updated 5 years ago
- ☆20 Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆35 Updated this week
- AMD SMI ☆62 Updated this week
- ☆32 Updated this week
- Slides and exercises for persistent memory programming tutorial ☆13 Updated 2 years ago
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆11 Updated last month
- Intel® SHMEM - Device initiated shared memory based communication library ☆23 Updated last month
- A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch ☆22 Updated this week
- HIP Python Low-level Bindings ☆25 Updated 3 weeks ago
- Linux based user-space RSHIM driver for the Mellanox BlueField SoC ☆31 Updated this week
- RCCL Performance Benchmark Tests ☆64 Updated this week
- High-Performance Linpack Benchmark adopted version for GPU backend ☆11 Updated 2 years ago
- Automated machine learning as an AI-HPC benchmark ☆66 Updated 2 years ago
- A hierarchical collective communications library with portable optimizations ☆35 Updated 5 months ago
- ☆58 Updated this week
- The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a… ☆17 Updated this week
- RDC ☆29 Updated this week
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs ☆89 Updated last year
- Tenstorrent Firmware repository ☆13 Updated this week
- Ongoing research training transformer models at scale ☆20 Updated this week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆82 Updated this week
- ☆36 Updated this week
- MPI Microbenchmarks ☆39 Updated 9 years ago