NVIDIA / tao_deploy
Package for deploying deep learning models from TAO Toolkit
☆16 Updated 3 weeks ago
Related projects:
- ☆56 Updated last year
- Quick start scripts and tutorial notebooks to get started with TAO Toolkit ☆35 Updated 3 weeks ago
- A tool to convert a TensorRT engine/plan to a fake ONNX model ☆37 Updated last year
- TAO Toolkit deep learning networks with PyTorch backend ☆81 Updated 3 weeks ago
- ☆77 Updated this week
- ☆13 Updated 5 months ago
- CUda Matrix Multiply library. ☆67 Updated 2 weeks ago
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos ☆60 Updated last year
- This repository describes how to add a custom TensorRT plugin in C++ and Python ☆25 Updated 3 years ago
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption ☆87 Updated last year
- Datasets, Transforms and Models specific to Computer Vision ☆82 Updated 10 months ago
- ☆37 Updated 2 months ago
- A minimal PyTorch implemented from scratch ☆18 Updated 3 years ago
- Collection of blogs on AI development ☆13 Updated last month
- ☆134 Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 Updated last year
- [CVPR-2023] Towards Any Structural Pruning ☆17 Updated last year
- ☆30 Updated 3 months ago
- A neural network training interface based on PyTorch, with a focus on flexibility ☆61 Updated 8 months ago
- [NeurIPS 2023] MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory ☆62 Updated 10 months ago
- Patch convolution to avoid large GPU memory usage of Conv2D ☆73 Updated 3 months ago
- ☆27 Updated last year
- TAO Toolkit deep learning networks with TensorFlow 1.x backend ☆10 Updated 7 months ago
- ☆11 Updated last year
- Standalone Flash Attention v2 kernel without libtorch dependency ☆93 Updated last week
- A study of CUTLASS ☆18 Updated last year
- Sparse convolution library derived from spconv ☆53 Updated 3 years ago
- This repository contains the results and code for the MLPerf™ Inference v2.1 benchmark. ☆18 Updated last year
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆174 Updated 3 months ago
- An object detection codebase based on MegEngine. ☆28 Updated last year