HabanaAI / SynapseAI_Core
SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi
☆37 Updated last year
Related projects
Alternatives and complementary repositories for SynapseAI_Core
- oneCCL Bindings for Pytorch* ☆86 Updated 3 weeks ago
- oneAPI Collective Communications Library (oneCCL) ☆206 Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆18 Updated 2 weeks ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs ☆75 Updated last week
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week
- ☆88 Updated this week
- Bandwidth test for ROCm ☆47 Updated 2 weeks ago
- RCCL Performance Benchmark Tests ☆50 Updated 3 weeks ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆57 Updated 2 months ago
- AMD's graph optimization engine. ☆186 Updated this week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆63 Updated this week
- ☆128 Updated this week
- Benchmarks to capture important workloads. ☆28 Updated 5 months ago
- ☆29 Updated this week
- ☆43 Updated 5 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆11 Updated last month
- ☆58 Updated last year
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs) ☆35 Updated this week
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆22 Updated last month
- ☆44 Updated last week
- An extension library of WMMA API (Tensor Core API) ☆84 Updated 4 months ago
- AMD’s C++ library for accelerating tensor primitives ☆35 Updated this week
- ROCm BLAS marshalling library ☆118 Updated this week
- oneAPI Level Zero Conformance & Performance test content ☆46 Updated this week
- ☆59 Updated this week
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆67 Updated 10 months ago
- RAND library for HIP programming language ☆111 Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆223 Updated this week
- ☆38 Updated this week
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆35 Updated 6 months ago