HabanaAI / DeepSpeedLinks

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

☆13

Alternatives and similar repositories for DeepSpeed

Users that are interested in DeepSpeed are comparing it to the libraries listed below

Sorting:

intel / torch-ccl
oneCCL Bindings for Pytorch*
☆99Updated this week
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆61Updated last month
intel / cutlass-sycl
A CUTLASS implementation using SYCL
☆32Updated this week
intel / xetla
☆62Updated 7 months ago
HabanaAI / vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
☆78Updated this week
intel / torch-xpu-ops
☆50Updated this week
uxlfoundation / oneCCL
oneAPI Collective Communications Library (oneCCL)
☆241Updated this week
libxsmm / tpp-pytorch-extension
Intel® Tensor Processing Primitives extension for Pytorch*
☆17Updated last week
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆346Updated this week
ROCm / triton
Development repository for the Triton language and compiler
☆127Updated this week
ROCm / rccl-tests
RCCL Performance Benchmark Tests
☆71Updated last week
ROCm / composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
☆443Updated this week
intel / pti-gpu
Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…
☆234Updated this week
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆197Updated this week
NVIDIA / nsight-training
Training material for Nsight developer tools
☆163Updated last year
intel / llvm-test-suite
☆20Updated 2 years ago
oneapi-src / level-zero-tests
oneAPI Level Zero Conformance & Performance test content
☆55Updated this week
HabanaAI / Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
☆167Updated 2 weeks ago
HabanaAI / gaudi-pytorch-bridge
☆17Updated last week
ROCm / rocSHMEM
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
☆98Updated this week
ROCm / hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆111Updated last week
ROCm / rccl
ROCm Communication Collectives Library (RCCL)
☆355Updated last week
wmmae / wmma_extension
An extension library of WMMA API (Tensor Core API)
☆99Updated last year
HabanaAI / Habana_Custom_Kernel
Provides the examples to write and build Habana custom kernels using the HabanaTools
☆22Updated 3 months ago
oneapi-src / oneAPI-tab
oneAPI Technical Advisory Board (TAB) Meeting Notes
☆72Updated last year
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆40Updated last year
oneapi-src / level-zero-spec
☆20Updated 3 months ago
oneapi-src / SYCLomatic
☆264Updated last week
openxla / community
Stores documents and resources used by the OpenXLA developer community
☆126Updated last year
ROCm / MAD
MAD (Model Automation and Dashboarding)
☆23Updated last week