cap-lab / jedi
A TensorRT-based deep learning inference acceleration framework targeting the NVIDIA Jetson embedded platform.
☆28 · Updated 3 weeks ago
Alternatives and similar repositories for jedi
Users interested in jedi are comparing it to the libraries listed below.
- Inference of quantization-aware trained networks using TensorRT ☆82 · Updated 2 years ago
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆200 · Updated last year
- This is a list of awesome edge-AI inference related papers. ☆95 · Updated last year
- Benchmark scripts for TVM ☆74 · Updated 3 years ago
- ☆36 · Updated 8 months ago
- ☆69 · Updated 2 years ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration ☆61 · Updated last month
- NART (NART is not A RunTime), a deep learning inference framework. ☆37 · Updated 2 years ago
- ☆11 · Updated 5 months ago
- play gemm with tvm ☆91 · Updated last year
- A Winograd Minimal Filter Implementation in CUDA ☆25 · Updated 3 years ago
- A set of examples around MegEngine ☆31 · Updated last year
- ☆149 · Updated 2 years ago
- A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson ☆332 · Updated 3 years ago
- YOLOv5 on Orin DLA ☆204 · Updated last year
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆200 · Updated 3 years ago
- This repository contains the results and code for the MLPerf™ Inference v2.1 benchmark. ☆18 · Updated 2 years ago
- ☆18 · Updated 2 weeks ago
- Code reading for TVM ☆76 · Updated 3 years ago
- Offline quantization tools for deployment. ☆129 · Updated last year
- Count number of parameters / MACs / FLOPS for ONNX models. ☆93 · Updated 8 months ago
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices" ☆54 · Updated 5 months ago
- ☆36 · Updated 2 years ago
- Llama INT4 CUDA inference with AWQ ☆54 · Updated 5 months ago
- [CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms ☆29 · Updated 2 years ago
- ☆19 · Updated 3 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆92 · Updated 3 weeks ago
- Common libraries for PPL projects ☆29 · Updated 3 months ago
- PyTorch Quantization-Aware Training Example ☆136 · Updated last year
- Collection of blogs on AI development ☆19 · Updated 7 months ago
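Several of the repositories above center on inference kernels such as Winograd minimal filtering. As background for what "minimal filtering" means, here is a sketch of the 1-D F(2,3) variant in NumPy; the CUDA repository listed above implements the 2-D generalization, and the function name `winograd_f23` is ours for illustration, not taken from any of these projects.

```python
import numpy as np

# Winograd F(2,3): compute 2 outputs of a 3-tap 1-D convolution
# (cross-correlation) using 4 multiplies instead of the direct method's 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output (inverse) transform

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 output samples."""
    U = G @ g            # transformed filter (4 values)
    V = BT @ d           # transformed input tile (4 values)
    return AT @ (U * V)  # elementwise product, then inverse transform

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 1.0, 1.0])
print(winograd_f23(d, g))                      # -> [6. 9.]
print(np.convolve(d, g[::-1], mode="valid"))   # direct result, matches
```

In a real kernel the filter transform `G @ g` is precomputed once per filter, so only the input and output transforms run per tile; that amortization is what makes the method pay off for 3x3 convolutions.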