HabanaAI / SynapseAI_Core
SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi
☆38Updated 2 years ago
Alternatives and similar repositories for SynapseAI_Core:
Users who are interested in SynapseAI_Core are comparing it to the libraries listed below.
- Provides examples for writing and building Habana custom kernels using the HabanaTools☆19Updated last month
- Bandwidth test for ROCm☆52Updated this week
- ☆99Updated 2 months ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…☆58Updated last month
- Benchmarks to capture important workloads.☆29Updated this week
- oneAPI Collective Communications Library (oneCCL)☆218Updated this week
- An extension library of WMMA API (Tensor Core API)☆87Updated 6 months ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs☆78Updated this week
- AMD's graph optimization engine.☆196Updated this week
- oneCCL Bindings for Pytorch*☆87Updated 2 weeks ago
- OpenAI Triton backend for Intel® GPUs☆154Updated this week
- CUDA Templates for Linear Algebra Subroutines☆11Updated this week
- ☆59Updated last month
- ☆48Updated 10 months ago
- RCCL Performance Benchmark Tests☆55Updated this week
- ☆64Updated 2 months ago
- ☆57Updated 7 months ago
- ☆131Updated this week
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆37Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆47Updated this week
- A GPU-driven system framework for scalable AI applications☆111Updated 3 months ago
- Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)☆34Updated 3 months ago
- An IR for efficiently simulating distributed ML computation.☆25Updated last year
- ROCm BLAS marshalling library☆125Updated this week
- CUDA 12.2 HMM demos☆19Updated 5 months ago
- A Python library that transfers PyTorch tensors between CPU and NVMe☆102Updated last month
- Standalone Flash Attention v2 kernel without libtorch dependency☆99Updated 4 months ago
- ☆56Updated 2 weeks ago
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona…☆73Updated this week
- Benchmark code for the "Online normalizer calculation for softmax" paper☆62Updated 6 years ago