Tiramisu-Compiler / tiramisu_pytorch
Integration of Tiramisu (Compiler) into PyTorch
☆25Updated 5 years ago
Alternatives and similar repositories for tiramisu_pytorch
Users who are interested in tiramisu_pytorch are comparing it to the libraries listed below.
- A self-contained version of the tutorial which can be easily cloned and viewed by others.☆24Updated 6 years ago
- Poplar libraries☆119Updated last year
- Test suite for probing the numerical behavior of NVIDIA tensor cores☆40Updated 11 months ago
- TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together☆64Updated 7 years ago
- ☆18Updated 5 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS.☆51Updated 7 years ago
- Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks.☆17Updated 5 years ago
- Poplar Advanced Runtime for the IPU☆7Updated last year
- Parser script that processes PyTorch autograd profiler results and converts the JSON file to Excel.☆14Updated 5 years ago
- ☆104Updated last year
- ☆23Updated 7 months ago
- Build TVM docker image for production compilation deployments☆12Updated 3 years ago
- A Data-Centric Compiler for Machine Learning☆84Updated last year
- Training material for IPU users: tutorials, feature examples, simple applications☆86Updated 2 years ago
- A polyhedral compiler for expressing fast and portable data parallel algorithms☆944Updated 8 months ago
- PyTorch interface for the IPU☆180Updated last year
- Tensors and Dynamic neural networks in Python with strong GPU acceleration☆26Updated 2 years ago
- GEMM and Winograd based convolutions using CUTLASS☆26Updated 5 years ago
- A Deep Learning Meta-Framework and HPC Benchmarking Library☆81Updated 3 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆27Updated 5 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com☆38Updated last year
- Memory Optimizations for Deep Learning (ICML 2023)☆98Updated last year
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems☆132Updated 5 years ago
- ☆35Updated 5 years ago
- Issues related to MLPerf™ training policies, including rules and suggested changes☆95Updated this week
- MLIR-based partitioning system☆105Updated this week
- Implementation for the paper "AdaTune: Adaptive Tensor Program Compilation Made Efficient" (NeurIPS 2020).☆14Updated 4 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆138Updated 4 years ago
- GPU Performance Advisor☆65Updated 2 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory☆132Updated 3 years ago