thb1314/tensorrt-layernorm-plugin


thb1314 / tensorrt-layernorm-plugin

☆27

Alternatives and similar repositories for tensorrt-layernorm-plugin

Users who are interested in tensorrt-layernorm-plugin are comparing it to the libraries listed below.

lix19937 / tensorrt-insight
Deep insight into TensorRT, including but not limited to QAT, PTQ, plugin, triton_inference, and CUDA
☆24 · Jul 2, 2026 · Updated 3 weeks ago
luliyucoordinate / flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
☆12 · Jun 10, 2024 · Updated 2 years ago
GalinaRejoice / learning-cuda-trt
☆12 · Jan 25, 2023 · Updated 3 years ago
qdLMF / LightGlue-with-FlashAttentionV2-TensorRT
A cutlass cute implementation of headdim-64 flashattentionv2 TensorRT plugin for LightGlue. Run on Jetson Orin NX 8GB with TensorRT 8.5.…
☆20 · Mar 3, 2025 · Updated last year
emre570 / transformer.cu
Transformer Architecture written with CUDA, C++ and LibTorch.
☆11 · Jul 26, 2025 · Updated 11 months ago
DD-DuDa / TensorRT-in-Action
TensorRT-in-Action is a GitHub repository that provides code examples for using TensorRT, with corresponding Jupyter Notebooks.
☆15 · Jun 1, 2023 · Updated 3 years ago
postmalloc / skeletonide
Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletoni…
☆14 · Oct 21, 2020 · Updated 5 years ago
SR0920 / CiT-Net
☆11 · Jul 11, 2025 · Updated last year
zhangcheng828 / TensorRT-Plugin
☆46 · Apr 7, 2022 · Updated 4 years ago
LixinLu42 / PerspectiveTransform
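Performs distortion correction and applies an inverse perspective transform with the homography matrix H to produce an IPM (inverse perspective mapping) image.
☆11 · Jul 9, 2019 · Updated 7 years ago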
KarhouTam / cuda-kernels
Some common CUDA kernel implementations (Not the fastest).
☆30 · Jun 24, 2026 · Updated 3 weeks ago
rail-berkeley / tensorrt-openvla
☆24 · Apr 30, 2025 · Updated last year
zzz3bbb3 / yolact-trt
Deploys the YOLACT segmentation algorithm with TensorRT
☆14 · May 7, 2022 · Updated 4 years ago
JulianSchmid / example-bazel-add-git-hash
Example for baking the current git commit hash into a bazel C++ project
☆11 · Jan 25, 2022 · Updated 4 years ago
yhwang-hub / yolov7_QAT
Quantize yolov7 using pytorch_quantization.🚀🚀🚀
☆12 · Oct 20, 2023 · Updated 2 years ago
restran / wiz-search
✏️ Offline Full Text Search for Wiz Note Mac Client
☆10 · May 15, 2019 · Updated 7 years ago
taifyang / PCL-algorithm
Some algorithms in PCL
☆19 · Jun 3, 2023 · Updated 3 years ago
ashishpatel26 / Facebook-AI-DEtection-TRansformer-DETR-Object-Detection
End-to-End Object Detection with Transformers
☆14 · May 31, 2020 · Updated 6 years ago
dingyuqing05 / trt2022_wenet
☆70 · Dec 9, 2022 · Updated 3 years ago
TRT2022 / MST-plus-plus-TensorRT
TensorRT 2022 second-round solution: TensorRT inference optimization of MST++, the first Transformer-based image reconstruction model
☆144 · Jul 6, 2022 · Updated 4 years ago
Bruce-Lee-LY / flash_attention_inference
Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
☆45 · Feb 27, 2025 · Updated last year
ShaYeBuHui01 / flash_attention_inference
Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
☆15 · Aug 31, 2023 · Updated 2 years ago
ZiadElmassik / CADCD_TO_KITTI
☆13 · Jul 27, 2022 · Updated 3 years ago
TeaPoly / warp-ctc-crf
An extension of thu-spmi/CAT which contains a full-fledged implementation of CTC-CRF for Tensorflow.
☆12 · Jul 5, 2021 · Updated 5 years ago
DerryHub / BEVFormer_tensorrt
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
☆578 · Nov 20, 2023 · Updated 2 years ago
cshbli / yolov5_qat_tensorrt
YOLOv5 Quantization Aware Training with TensorRT
☆28 · Jan 10, 2023 · Updated 3 years ago
revilokeb / vgg16_batchnorm
VGG16 architecture with BatchNorm
☆14 · Apr 4, 2017 · Updated 9 years ago
liu-chun-wu / NoteGenius
☆13 · Jun 25, 2025 · Updated last year
zmyzxb / HOME
An implementation of HOME: Heatmap Output for future Motion Estimation
☆13 · Feb 7, 2022 · Updated 4 years ago
rc-dukes / dash2
Real-time motion planner and autonomous vehicle simulator in the browser, built with WebGL and Three.js.
☆13 · Jun 25, 2026 · Updated 3 weeks ago
emptysoal / Deepsort-YOLOv5-TensorRT
An object tracking project with YOLOv5-v5.0 and Deepsort, sped up with C++ and TensorRT.
☆16 · Oct 23, 2025 · Updated 9 months ago
leohsuofnthu / Pytorch-IterativeFCN
Pytorch implementation of the paper Iterative fully convolutional neural networks for automatic vertebra segmentation accepted in MIDL201…
☆67 · Jul 6, 2023 · Updated 3 years ago
sunkx109 / llama.cpp
llama 2 Inference
☆43 · Nov 4, 2023 · Updated 2 years ago
istoony / winograd-convolutional-nn
I'm going to use the Winograd’s minimal filtering algorithms to introduce a new class of fast algorithms for convolutional neural networks…
☆12 · Mar 22, 2018 · Updated 8 years ago
Jianxff / svo_pro_universal
Plain cmake version for rpg_svo_pro_open (svo2.0). No ros.
☆14 · Apr 11, 2024 · Updated 2 years ago
zhuzilin / pytorch-malloc
An external memory allocator example for PyTorch.
☆16 · Aug 10, 2025 · Updated 11 months ago
boostorg / align
Boost.Align
☆16 · Jul 13, 2026 · Updated last week
RuningMangoPi / yolov8_QAT
☆17 · Oct 16, 2023 · Updated 2 years ago
Irwin-Liu / hfnet-tf2onnx
Convert an HFNet trained model from Tensorflow to ONNX
☆12 · Jan 3, 2020 · Updated 6 years ago