Tiramisu-Compiler / tiramisu_pytorch
Integration of Tiramisu (Compiler) into PyTorch
☆26 · Updated 4 years ago
Related projects:
- HW-PR-NAS is a single surrogate model trained to Pareto-rank architectures based on accuracy, latency, and energy consumption ☆11 · Updated last year
- Hybrid Tiny Hardware-aware Neural Architecture Search ☆15 · Updated 2 years ago
- A self-contained version of the tutorial which can be easily cloned and viewed by others. ☆26 · Updated 5 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆30 · Updated 4 months ago
- A Winograd Minimal Filter Implementation in CUDA ☆20 · Updated 3 years ago
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆26 · Updated last year
- GEMM and Winograd based convolutions using CUTLASS ☆24 · Updated 4 years ago
- Stores documents and resources used by the OpenXLA developer community ☆105 · Updated last month
- ☆66 · Updated this week
- Fast sparse deep learning on CPUs ☆51 · Updated last year
- Conversions to MLIR EmitC ☆120 · Updated 3 weeks ago
- ☆48 · Updated 6 months ago
- A Data-Centric Compiler for Machine Learning ☆81 · Updated 8 months ago
- A plugin for Jupyter Notebook to run CUDA C/C++ code ☆190 · Updated last week
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆29 · Updated last month
- Poplar libraries ☆114 · Updated 11 months ago
- NNCG: A Neural Network Code Generator ☆33 · Updated last month
- Simple neural network implementation using CUDA technology. It is an educational implementation. ☆92 · Updated 6 years ago
- ☆11 · Updated 2 years ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆90 · Updated this week
- Customized matrix multiplication kernels ☆53 · Updated 2 years ago
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX ☆121 · Updated this week
- A sandbox for quick iteration and experimentation on projects related to IREE, MLIR, and LLVM ☆52 · Updated last week
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 · Updated 9 months ago
- ☆137 · Updated 3 months ago
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research ☆95 · Updated 11 months ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆127 · Updated 2 years ago
- PyTorch interface for the IPU ☆176 · Updated 11 months ago
- Research and development for optimizing transformers ☆121 · Updated 3 years ago
- ☆43 · Updated last month