TensorRT wrapper: learning, rewriting, and practice.
☆30Oct 19, 2022Updated 3 years ago
Alternatives and similar repositories for trt_learn
Users that are interested in trt_learn are comparing it to the libraries listed below
- ☆30Nov 16, 2024Updated last year
- Async inference for machine learning models☆26Sep 21, 2022Updated 3 years ago
- A plugin-oriented framework for video structuring. Chinese developers: add WeChat zhzhi78 to be invited to the discussion group.☆18May 28, 2024Updated last year
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆32Dec 21, 2024Updated last year
- A practical way of learning Swizzle☆37Feb 3, 2025Updated last year
- Awesome code, projects, books, etc. related to CUDA☆31Feb 3, 2026Updated last month
- Learning a range of topics by following Tensorrt_pro☆40Nov 25, 2022Updated 3 years ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration☆81May 26, 2025Updated 9 months ago
- ☆47Mar 27, 2023Updated 2 years ago
- GEMV implementation with CUTLASS☆19Aug 21, 2025Updated 6 months ago
- Kernel Library Wheel for SGLang☆16Updated this week
- Cute layout visualization☆30Jan 18, 2026Updated last month
- Notes on understanding the open-source tensorRT_Pro project☆22Feb 23, 2023Updated 3 years ago
- Step-by-step optimization of CUDA SGEMM☆432Mar 30, 2022Updated 3 years ago
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- ☆14Nov 3, 2025Updated 4 months ago
- This is a repository to practice multi-thread programming in C++☆28Feb 21, 2024Updated 2 years ago
- Software and hardware H.264 decoding, based on FFmpeg and MPP☆11Mar 23, 2022Updated 3 years ago
- A llama model inference framework implemented in CUDA C++☆64Nov 8, 2024Updated last year
- ☆32Jul 2, 2025Updated 8 months ago
- ☆21Aug 14, 2024Updated last year
- ☆49Apr 15, 2024Updated last year
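- This repository deploys the Nanodet detection algorithm under the OpenVINO inference framework and rewrites the pre-processing and post-processing stages for very high performance, making detection on Intel CPU platforms take off! The model is also quantized (PTQ) to int8 precision with the NNCF and PPQ tools for even faster inference!☆16Jun 14, 2023Updated 2 years ago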
- A CUDA kernel for NHWC GroupNorm for PyTorch☆23Nov 15, 2024Updated last year
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆46Jun 11, 2025Updated 8 months ago
- fp8 flash attention on the ada architecture, implemented with the cutlass repository☆79Aug 12, 2024Updated last year
- ☆19Jan 19, 2024Updated 2 years ago
- A simplified flash-attention implementation using cutlass, intended for teaching☆58Aug 12, 2024Updated last year
- HunyuanDiT with TensorRT and libtorch☆18May 22, 2024Updated last year
- TensorRT deployment of BiSeNetV1 and BiSeNetV2☆20Apr 14, 2022Updated 3 years ago
- rknn inference☆48Mar 7, 2022Updated 3 years ago
- ☆113Mar 11, 2024Updated last year
- TensorRT quantization and inference code for yolov5, tested and confirmed to run on the Jetson platform☆20May 11, 2023Updated 2 years ago
- ☆20Jul 20, 2022Updated 3 years ago
- Deep Learning Deployment Framework: Supports tf/torch/trt/trtllm/vllm and other NN frameworks. Support dynamic batching, and streaming mo…☆168May 8, 2025Updated 9 months ago
- ☆20Dec 29, 2023Updated 2 years ago
- Deploying the DeepSeek-R1 distilled 1.5B model on Android phones☆23Feb 4, 2025Updated last year
- Sample projects for TensorRT in C++☆198Feb 17, 2023Updated 3 years ago
- Efficient deployment: TRT inference for YOLO X, V3, V4, V5, V6, V7, V8, EdgeYOLO ™️, with pre-processing and post-processing both implemented as CUDA kernels. CPP/CUDA🚀☆53Feb 23, 2023Updated 3 years ago