microsoft / DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
☆2,172, updated last week
Related projects:
- Fork of TensorFlow accelerated by DirectML (☆457, updated last year)
- A Python package that extends the official PyTorch to deliver additional performance on Intel platforms (☆1,554, updated this week)
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. (☆1,524, updated this week)
- AMD ROCm™ Software - GitHub Home (☆4,493, updated this week)
- DirectML PluggableDevice plugin for TensorFlow 2 (☆184, updated 3 months ago)
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… (☆2,152, updated this week)
- oneAPI Deep Neural Network Library (oneDNN) (☆3,579, updated this week)
- AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… (☆4,531, updated last week)
- Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX (☆2,296, updated 2 weeks ago)
- Build and run containers leveraging NVIDIA GPUs (☆2,219, updated this week)
- Simple, safe way to store and distribute tensors (☆2,755, updated 2 weeks ago)
- PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT (☆2,499, updated this week)
- Examples for using ONNX Runtime for machine learning inferencing. (☆1,144, updated 2 weeks ago)
- Intel® Extension for TensorFlow* (☆314, updated 2 weeks ago)
- Dockerfiles for the various software layers defined in the ROCm software platform (☆420, updated last month)
- A machine learning compiler for GPUs, CPUs, and ML accelerators (☆2,580, updated this week)
- High-efficiency floating-point neural network inference operators for mobile, server, and Web (☆1,812, updated this week)
- OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference (☆6,853, updated this week)
- ONNXMLTools enables conversion of models to ONNX (☆992, updated 3 months ago)
- SHARK - High Performance Machine Learning Distribution (☆1,413, updated last month)
- AMD's Machine Intelligence Library (☆1,051, updated this week)
- CUDA on ??? GPUs (☆8,941, updated this week)
- Development repository for the Triton language and compiler (☆12,744, updated this week)
- HIP: C++ Heterogeneous-Compute Interface for Portability (☆3,690, updated this week)
- Hackable and optimized Transformers building blocks, supporting a composable construction. (☆8,351, updated last week)
- An Open Source Machine Learning Framework for Everyone (☆969, updated last month)
- CUDA Python Low-level Bindings (☆850, updated 2 weeks ago)
- A retargetable MLIR-based machine learning compiler and runtime toolkit. (☆2,559, updated this week)
- NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source compone… (☆10,559, updated this week)
- Generative AI extensions for onnxruntime (☆421, updated this week)