aieater / rocm_pytorch_informations
The official ROCm/PyTorch page contains information that is consistently confusing. This page aims to provide accurate information based on knowledge gained from developing the GPUEater infrastructure.
☆87 Updated 4 years ago
Alternatives and similar repositories for rocm_pytorch_informations
Users that are interested in rocm_pytorch_informations are comparing it to the libraries listed below
- 3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1 ☆136 Updated 3 years ago
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆237 Updated this week
- ☆74 Updated last year
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 Updated 7 months ago
- Memory Efficient Attention (O(sqrt(n))) for Jax and PyTorch ☆184 Updated 2 years ago
- Lite Inference Toolkit (LIT) for PyTorch ☆161 Updated 3 years ago
- Accelerate PyTorch models with ONNX Runtime ☆364 Updated 5 months ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆236 Updated 2 years ago
- Haste: a fast, simple, and open RNN library ☆332 Updated 2 years ago
- PyTorch interface for the IPU ☆180 Updated last year
- Tensorflow Wheels ☆135 Updated 3 years ago
- EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax ☆128 Updated last year
- Make TFRecord Usable Again ☆88 Updated 2 years ago
- Productionize machine learning predictions, with ONNX or without ☆65 Updated last year
- Benchmarks of well-known CNN models in PyTorch, run on various GPUs. ☆240 Updated last year
- Lightweight machine learning library based on OpenCL 1.2 ☆75 Updated 4 years ago
- Torch Distributed Experimental ☆117 Updated last year
- Fast Block Sparse Matrices for Pytorch ☆548 Updated 4 years ago
- Python Research Framework ☆106 Updated 2 years ago
- PyTorch implementation of L2L execution algorithm ☆107 Updated 2 years ago
- Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc. ☆240 Updated 3 years ago
- ADAS is short for Adaptive Step Size, it's an optimizer that unlike other optimizers that just normalize the derivative, it fine-tunes th… ☆85 Updated 4 years ago
- Library for 8-bit optimizers and quantization routines. ☆773 Updated 2 years ago
- [Prototype] Tools for the concurrent manipulation of variably sized Tensors. ☆251 Updated 2 years ago
- Customized matrix multiplication kernels ☆56 Updated 3 years ago
- ☆39 Updated 2 years ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆180 Updated last month
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ☆346 Updated last month
- GPU fan control for headless Linux ☆345 Updated 2 years ago
- HetSeq: Distributed GPU Training on Heterogeneous Infrastructure ☆106 Updated 2 years ago