cap-lab / jediLinks

Jetson embedded platform-target deep learning inference acceleration framework with TensorRT

☆28

Alternatives and similar repositories for jedi

Users that are interested in jedi are comparing it to the libraries listed below

Sorting:

NVIDIA / sampleQAT
Inference of quantization aware trained networks using TensorRT
☆82Updated 2 years ago
NVIDIA / Deep-Learning-Accelerator-SW
NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
☆200Updated last year
Kyrie-Zhao / awesome-real-time-AI
This is a list of awesome edgeAI inference related papers.
☆95Updated last year
tlc-pack / TLCBench
Benchmark scripts for TVM
☆74Updated 3 years ago
OpenPPL / ppl.kernel.cuda
☆36Updated 8 months ago
masahi / torchscript-to-tvm
☆69Updated 2 years ago
leimao / TensorRT-Custom-Plugin-Example
Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration
☆61Updated last month
ModelTC / NART
NART = NART is not A RunTime, a deep learning inference framework.
☆37Updated 2 years ago
ModelTC / quant_horizon
☆11Updated 5 months ago
LeiWang1999 / tvm_gpu_gemm
play gemm with tvm
☆91Updated last year
UDC-GAC / openCNN
A Winograd Minimal Filter Implementation in CUDA
☆25Updated 3 years ago
MegEngine / examples
A set of examples around MegEngine
☆31Updated last year
Qualcomm-AI-research / FP8-quantization
☆149Updated 2 years ago
NVIDIA-AI-IOT / jetson_dla_tutorial
A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson
☆332Updated 3 years ago
NVIDIA-AI-IOT / cuDLA-samples
YOLOv5 on Orin DLA
☆204Updated last year
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200Updated 3 years ago
mlcommons / inference_results_v2.1
This repository contains the results and code for the MLPerf™ Inference v2.1 benchmark.
☆18Updated 2 years ago
xinetzone / tvm-book
☆18Updated 2 weeks ago
Archermmt / tvm_walk_through
code reading for tvm
☆76Updated 3 years ago
ModelTC / Dipoorlet
Offline Quantization Tools for Deploy.
☆129Updated last year
gmalivenko / onnx-opcounter
Count number of parameters / MACs / FLOPS for ONNX models.
☆93Updated 8 months ago
xxxxyu / FlexNN
Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"
☆54Updated 5 months ago
ModelTC / NNLQP
☆36Updated 2 years ago
ankan-ban / llama_cu_awq
llama INT4 cuda inference with AWQ
☆54Updated 5 months ago
UoS-EEC / DynamicOFA
[CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
☆29Updated 2 years ago
UbiquitousLearning / MobileDLFrameworksBenchmark
☆19Updated 3 years ago
tlc-pack / cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆92Updated 3 weeks ago
OpenPPL / ppl.common
Common libraries for PPL projects
☆29Updated 3 months ago
leimao / PyTorch-Quantization-Aware-Training
PyTorch Quantization Aware Training Example
☆136Updated last year
HeKun-NVIDIA / AI-Blog
Collection of blogs on AI development
☆19Updated 7 months ago