HabanaAI / SynapseAI_Core
SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi
☆40Updated 3 months ago
Alternatives and similar repositories for SynapseAI_Core
Users who are interested in SynapseAI_Core are comparing it to the libraries listed below.
- oneCCL Bindings for Pytorch*☆97Updated 2 weeks ago
- Benchmarks to capture important workloads.☆31Updated 3 months ago
- Bandwidth test for ROCm☆55Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…☆61Updated 2 months ago
- oneAPI Collective Communications Library (oneCCL)☆232Updated last week
- ☆50Updated last year
- MLIR-based partitioning system☆82Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools☆21Updated last month
- ☆60Updated 4 months ago
- OpenAI Triton backend for Intel® GPUs☆184Updated this week
- ☆106Updated last month
- GEMM and Winograd based convolutions using CUTLASS☆26Updated 4 years ago
- Memory Optimizations for Deep Learning (ICML 2023)☆64Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆40Updated last month
- RCCL Performance Benchmark Tests☆64Updated this week
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆84Updated this week
- A Python library that transfers PyTorch tensors between CPU and NVMe☆115Updated 5 months ago
- A CUTLASS implementation using SYCL☆21Updated this week
- Standalone Flash Attention v2 kernel without libtorch dependency☆108Updated 8 months ago
- Computation using data flow graphs for scalable machine learning☆67Updated this week
- ☆68Updated last month
- ☆79Updated 6 months ago
- Fast sparse deep learning on CPUs☆53Updated 2 years ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo.☆100Updated 2 months ago
- ☆45Updated 10 months ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope…☆62Updated last week
- Library for modelling performance costs of different Neural Network workloads on NPU devices☆33Updated last week
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …☆65Updated 3 years ago
- An extension library of WMMA API (Tensor Core API)☆96Updated 10 months ago
- Reference models for Intel(R) Gaudi(R) AI Accelerator☆162Updated 2 weeks ago