octoml / public-tvm-docker
Build TVM docker image for production compilation deployments
☆13Updated 3 years ago
Related projects
Alternatives and complementary repositories for public-tvm-docker
- Yet another Polyhedra Compiler for DeepLearning☆19Updated last year
- Benchmark scripts for TVM☆73Updated 2 years ago
- tophub autotvm log collections☆70Updated last year
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆26Updated 5 years ago
- Benchmark of TVM quantized model on CUDA☆112Updated 4 years ago
- modified cutlass☆14Updated 4 years ago
- Codebase associated with the PyTorch compiler tutorial☆44Updated 5 years ago
- TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together☆63Updated 6 years ago
- Issues related to MLPerf™ Inference policies, including rules and suggested changes☆57Updated this week
- Benchmark code for the "Online normalizer calculation for softmax" paper☆59Updated 6 years ago
- GEMM and Winograd based convolutions using CUTLASS☆25Updated 4 years ago
- Implementation for the paper "AdaTune: Adaptive Tensor Program Compilation Made Efficient" (NeurIPS 2020).☆13Updated 3 years ago
- Benchmark PyTorch Custom Operators☆13Updated last year
- Test winograd convolution written in TVM for CUDA and AMDGPU☆40Updated 6 years ago
- ICML 2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Updated 5 years ago
- A demo of how to write a high-performance convolution that runs on Apple silicon☆52Updated 2 years ago
- System for automated integration of deep learning backends.☆48Updated 2 years ago
- DLPack for Tensorflow☆36Updated 4 years ago
- An external memory allocator example for PyTorch.☆13Updated 3 years ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …☆63Updated 2 years ago
- A self-contained version of the tutorial which can be easily cloned and viewed by others.☆24Updated 5 years ago
- Accelerating CNN's convolution operation on GPUs by using memory-efficient data access patterns.☆14Updated 6 years ago
- Sandbox for TVM and playing around!☆22Updated last year