intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that adds feature support through SYCL kernels on Intel GPU (XPU) devices. Note that XPU is already supported in stock (upstream) DeepSpeed.
☆62Updated 3 weeks ago
Alternatives and similar repositories for intel-extension-for-deepspeed:
Users who are interested in intel-extension-for-deepspeed are comparing it to the libraries listed below.
- oneCCL Bindings for Pytorch*☆91Updated 2 weeks ago
- ☆37Updated this week
- OpenAI Triton backend for Intel® GPUs☆170Updated this week
- oneAPI Collective Communications Library (oneCCL)☆227Updated this week
- Intel® Tensor Processing Primitives extension for Pytorch*☆12Updated last week
- RCCL Performance Benchmark Tests☆60Updated 2 weeks ago
- ☆61Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆62Updated this week
- Microsoft Collective Communication Library☆60Updated 4 months ago
- MSCCL++: A GPU-driven communication stack for scalable AI applications☆315Updated this week
- Benchmarks to capture important workloads.☆30Updated last month
- ☆62Updated last month
- Ahead of Time (AOT) Triton Math Library☆56Updated last week
- Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).☆240Updated 5 months ago
- ☆45Updated last month
- ROCm Communication Collectives Library (RCCL)☆308Updated this week
- A Python library that transfers PyTorch tensors between CPU and NVMe☆111Updated 4 months ago
- Experimental projects related to TensorRT☆94Updated this week
- oneAPI Level Zero Conformance & Performance test content☆48Updated this week
- CUDA Templates for Linear Algebra Subroutines☆16Updated this week
- AI Tensor Engine for ROCm☆119Updated this week
- Shared Middle-Layer for Triton Compilation☆233Updated 2 weeks ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆40Updated last week
- ☆73Updated 4 months ago
- Synthesizer for optimal collective communication algorithms☆106Updated 11 months ago
- NCCL Profiling Kit☆128Updated 8 months ago
- Large Language Model Text Generation Inference on Habana Gaudi☆32Updated last week
- ☆25Updated this week
- ☆30Updated 2 years ago
- Assembler for NVIDIA Volta and Turing GPUs☆214Updated 3 years ago