NVIDIA / TensorRT-Edge-LLM
High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI
☆227Updated last month
Alternatives and similar repositories for TensorRT-Edge-LLM
Users who are interested in TensorRT-Edge-LLM are comparing it to the libraries listed below.
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration☆79Updated 8 months ago
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.☆225Updated last year
- A simple tool that can generate TensorRT plugin code quickly.☆239Updated 2 years ago
- YOLOv5 on Orin DLA☆221Updated last year
- A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson☆364Updated 3 years ago
- Collection of blogs on AI development☆23Updated last year
- A tutorial for CUDA&PyTorch☆227Updated last week
- ☆314Updated 3 years ago
- TensorRT Plugin Autogen Tool☆366Updated 2 years ago
- Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda☆23Updated 3 weeks ago
- TensorRT 7 C++ (almost) minimal examples☆84Updated 2 years ago
- This is a repository to practice multi-thread programming in C++☆27Updated last year
- ☆38Updated last year
- Offline Quantization Tools for Deploy.☆142Updated 2 years ago
- This code accompanies the Bilibili video at https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7 ☆71Updated 2 years ago
- ☆60Updated last year
- High Performance LLM Inference Operator Library☆695Updated this week
- Llama3 Streaming Chat Sample☆22Updated last year
- A large collection of examples for learning CUDA and TensorRT.☆170Updated 3 years ago
- BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).☆554Updated 2 years ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆108Updated last year
- A parser, editor and profiler tool for ONNX models.☆478Updated 3 months ago
- Serving Inside Pytorch☆170Updated 2 weeks ago
- ☆26Updated 5 months ago
- ☆152Updated last year
- ☆145Updated last year
- A light llama-like llm inference framework based on the triton kernel.☆171Updated last month
- Deep Learning tools and applications for NVIDIA AGX platforms.☆266Updated last week
- ☆45Updated 3 years ago
- ☆43Updated 4 years ago