ailzhang / EfficientPyTorch
Code release for book "Efficient Training in PyTorch"
☆19Updated last month
Related projects
Alternatives and complementary repositories for EfficientPyTorch
- ☆22Updated 4 years ago
- BGHT: High-performance static GPU hash tables.☆55Updated 2 months ago
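- Appraising popular parallel programming frameworks from around the web: performance benchmarks (with sharp commentary from 小彭老师). Evaluated so far: Taichi, SyCL, C++, OpenMP, TBB, Mojo☆34Updated last year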
- A demo of how to write a high-performance convolution that runs on Apple silicon☆52Updated 2 years ago
- TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.☆154Updated this week
- An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores.☆49Updated 3 months ago
- μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updatin…☆151Updated this week
- GPTQ inference TVM kernel☆36Updated 6 months ago
- ☆18Updated last month
- play gemm with tvm☆84Updated last year
- CVFusion is an open-source deep learning compiler to fuse the OpenCV operators.☆26Updated 2 years ago
- study of cutlass☆19Updated last week
- ☆33Updated 5 months ago
- ☆48Updated 8 months ago
- Standalone Flash Attention v2 kernel without libtorch dependency☆98Updated 2 months ago
- TaichiCon: Taichi Conferences☆71Updated 2 years ago
- ☆14Updated 2 years ago
- ☆79Updated last year
- Code for SIGGRAPH 2022 paper "Automatic quantization for physics-based simulation"☆61Updated 2 years ago
- A TVM-like CUDA/C code generator.☆9Updated 2 years ago
- 小彭老师's SyCL 2020 course (work in progress; to be released later via livestream)☆15Updated last year
- A language and compiler for irregular tensor programs.☆133Updated 6 months ago
- DietCode Code Release☆62Updated 2 years ago
- ☆70Updated last year
- Examples of CUDA implementations by Cutlass CuTe☆98Updated last week
- High Performance Grouped GEMM in PyTorch☆22Updated 2 years ago
- ☆16Updated this week
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme☆46Updated 2 months ago
- ☆35Updated 2 weeks ago