aieater / rocm_pytorch_informations
The official ROCm/PyTorch pages often contain confusing information. This page aims to provide accurate information based on knowledge gained from GPUEater infrastructure development.
☆87 · Updated 4 years ago
Alternatives and similar repositories for rocm_pytorch_informations:
Users who are interested in rocm_pytorch_informations are comparing it to the libraries listed below:
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆223 · Updated this week
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 · Updated 3 months ago
- Fast Block Sparse Matrices for PyTorch ☆545 · Updated 4 years ago
- 3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1 ☆136 · Updated 3 years ago
- Mish Activation Function for PyTorch ☆148 · Updated 4 years ago
- Code for scaling Transformers ☆26 · Updated 4 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆40 · Updated 4 years ago
- PyTorch implementation of L2L execution algorithm ☆107 · Updated 2 years ago
- Benchmarks of well-known CNN models in PyTorch on various GPUs ☆234 · Updated 10 months ago
- Accelerate PyTorch models with ONNX Runtime ☆359 · Updated 2 months ago
- [Prototype] Tools for the concurrent manipulation of variably sized Tensors. ☆251 · Updated 2 years ago
- ☆39 · Updated 2 years ago
- PyProf2: PyTorch Profiling tool ☆82 · Updated 4 years ago
- Implementation of https://arxiv.org/abs/1904.00962 ☆374 · Updated 4 years ago
- Large Model Support in PyTorch ☆133 · Updated 3 years ago
- Deep Learning Primitives and Mini-Framework for OpenCL ☆193 · Updated 7 months ago
- PyTorch dataset extended with map, cache etc. (tensorflow.data like) ☆329 · Updated 2 years ago
- Lite Inference Toolkit (LIT) for PyTorch ☆161 · Updated 3 years ago
- ☆87 · Updated 2 years ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 · Updated last year
- Customized matrix multiplication kernels ☆54 · Updated 3 years ago
- int8_t and int16_t matrix multiply based on https://arxiv.org/abs/1705.01991 ☆71 · Updated last year
- ☆109 · Updated 4 years ago
- Tensor Shape Annotation Library (numpy, tensorflow, pytorch, ...) ☆265 · Updated 4 years ago
- ☆74 · Updated last year
- Library for 8-bit optimizers and quantization routines. ☆716 · Updated 2 years ago
- nGraph™ Backend for ONNX ☆42 · Updated 2 years ago
- A mini Deep Learning framework supporting GPU accelerations written with CUDA ☆32 · Updated 4 years ago
- NVIDIA GPU tools - monitoring on CLI & web app with multiple agents ☆87 · Updated 11 months ago
- Example repository for custom C++/CUDA operators for TorchScript ☆114 · Updated 2 years ago