cap-lab / jedi
A TensorRT-based deep learning inference acceleration framework targeting Jetson embedded platforms
☆29 Updated last month
Alternatives and similar repositories for jedi
Users who are interested in jedi compare it to the libraries listed below
- Inference of quantization aware trained networks using TensorRT ☆83 Updated 2 years ago
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆220 Updated last year
- Count number of parameters / MACs / FLOPS for ONNX models. ☆95 Updated last year
- A set of examples around MegEngine ☆31 Updated last year
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices" ☆56 Updated 10 months ago
- Offline Quantization Tools for Deploy. ☆141 Updated last year
- [CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms ☆30 Updated 3 years ago
- ☆17 Updated 5 years ago
- ☆166 Updated 2 years ago
- PyTorch Quantization Aware Training Example ☆146 Updated last year
- llama INT4 cuda inference with AWQ ☆55 Updated 10 months ago
- Benchmark scripts for TVM ☆74 Updated 3 years ago
- ☆68 Updated 2 years ago
- ☆36 Updated 3 years ago
- ☆38 Updated last year
- ☆18 Updated this week
- Tencent Distribution of TVM ☆15 Updated 2 years ago
- Converter from MegEngine to other frameworks ☆70 Updated 2 years ago
- ☆36 Updated 2 years ago
- This repository contains the results and code for the MLPerf™ Inference v1.0 benchmark. ☆32 Updated 4 months ago
- CVFusion is an open-source deep learning compiler to fuse the OpenCV operators. ☆32 Updated 3 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆96 Updated 2 months ago
- Common libraries for PPL projects ☆30 Updated 8 months ago
- A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices. ☆360 Updated last year
- FakeQuantize with Learned Step Size(LSQ+) as Observer in PyTorch ☆36 Updated 3 years ago
- ☆60 Updated last year
- ☆11 Updated 10 months ago
- ☆98 Updated 4 years ago
- ☆24 Updated 2 years ago
- A Winograd Minimal Filter Implementation in CUDA ☆28 Updated 4 years ago