dianhsu / swin-transformer-cpp
Swin Transformer C++ Implementation
☆63Updated 3 years ago
Alternatives and similar repositories for swin-transformer-cpp:
Users that are interested in swin-transformer-cpp are comparing it to the libraries listed below
- A simple Transformer model implemented in C++. Attention Is All You Need.☆50Updated 4 years ago
- A Winograd Minimal Filter Implementation in CUDA☆24Updated 3 years ago
- play gemm with tvm☆90Updated last year
- CUDA Templates for Linear Algebra Subroutines☆98Updated last year
- CPU Memory Compiler and Parallel programming☆26Updated 5 months ago
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"☆53Updated 3 months ago
- ResNet Implementation, Training, and Inference Using LibTorch C++ API☆40Updated 10 months ago
- ☆30Updated last year
- ☆36Updated 6 months ago
- Code and notes for the six major CUDA parallel computing patterns☆60Updated 4 years ago
- How to design a CPU GEMM on x86 with AVX256 that can beat OpenBLAS.☆70Updated 6 years ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API☆30Updated last year
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.☆61Updated 7 months ago
- ☆17Updated last year
- Common libraries for PPL projects☆29Updated last month
- Examples of CUDA implementations by Cutlass CuTe☆159Updated 2 months ago
- ☆30Updated 2 years ago
- ☆96Updated 3 years ago
- Benchmark code for the "Online normalizer calculation for softmax" paper☆91Updated 6 years ago
- ☆109Updated last year
- CUDA Matrix Multiplication Optimization☆181Updated 9 months ago
- ☆143Updated 2 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer☆91Updated 3 weeks ago
- A tutorial for CUDA&PyTorch☆137Updated 3 months ago
- PyTorch Quantization Aware Training Example☆135Updated 11 months ago
- Manually implemented quantization-aware training☆21Updated 2 years ago
- code reading for tvm☆76Updated 3 years ago
- ☆20Updated 4 years ago
- A llama model inference framework implemented in CUDA C++☆50Updated 5 months ago
- ☆122Updated last year