Tlntin/ChatGLM2-6B-TensorRT


Tlntin / ChatGLM2-6B-TensorRT

☆90

Alternatives and similar repositories for ChatGLM2-6B-TensorRT

Users who are interested in ChatGLM2-6B-TensorRT are comparing it to the libraries listed below.


TRT2022 / ControlNet_TensorRT
View on GitHub
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization contest
☆50 · Aug 16, 2023 · Updated 2 years ago
Tlntin / trt2023
View on GitHub
☆26 · Aug 15, 2023 · Updated 2 years ago
K024 / chatglm-q
View on GitHub
Another ChatGLM2 implementation for GPTQ quantization
☆55 · Oct 15, 2023 · Updated 2 years ago
latentCall145 / channels-last-groupnorm
View on GitHub
A CUDA kernel for NHWC GroupNorm for PyTorch
☆23 · Nov 15, 2024 · Updated last year
EdVince / whisper-trtllm
View on GitHub
Whisper in TensorRT-LLM
☆17 · Sep 21, 2023 · Updated 2 years ago
YukSing12 / anchordetr_tensorrt
View on GitHub
trt-hackathon-2022 third-prize solution
☆10 · Mar 6, 2023 · Updated 3 years ago
triple-mu / TensorRT2ONNX
View on GitHub
A tool to convert a TensorRT engine/plan to a fake ONNX model
☆41 · Nov 22, 2022 · Updated 3 years ago
keddyjin / TensorRT_StableDiffusion_ControlNet
View on GitHub
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that del…
☆26 · Jul 21, 2023 · Updated 3 years ago
EdVince / llm-cpp
View on GitHub
☆34 · Jul 23, 2024 · Updated 2 years ago
wangzhaode / mnn-llm
View on GitHub
LLM deployment project based on MNN. This project has been merged into MNN.
☆1,616 · Jan 20, 2025 · Updated last year
triple-mu / HunyuanDiT-TensorRT-libtorch
View on GitHub
HunyuanDiT with TensorRT and libtorch
☆18 · May 22, 2024 · Updated 2 years ago
triple-mu / Stable-Diffusion-TensorRT
View on GitHub
Stable Diffusion in TensorRT 8.5+
☆15 · Mar 19, 2023 · Updated 3 years ago
chineseocr / ai-medical
View on GitHub
Progressively open-sourcing deep learning models and datasets for the medical industry
☆13 · Dec 30, 2021 · Updated 4 years ago
richjjj / cuvid-tensorrt-multi
View on GitHub
ffmpeg+cuvid+tensorrt+multicamera
☆12 · Dec 31, 2024 · Updated last year
tlc-pack / cutlass_fpA_intB_gemm
View on GitHub
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆96 · Jun 21, 2026 · Updated last month
cqu20160901 / FastSAM_onnx_rknn
View on GitHub
Deployment version of FastSAM: easy to port to different platforms, simple to deploy, and fast to run.
☆25 · May 30, 2024 · Updated 2 years ago
Zhou-wy / TRT-YOLOv8-Seg
View on GitHub
Accelerates YOLOv8-Seg with TensorRT; a complete backend framework including an HTTP server, a MySQL database, ffmpeg video streaming, and more.
☆90 · Oct 9, 2023 · Updated 2 years ago
lxl24 / SwinTransformerV2_TensorRT
View on GitHub
For 2022 Nvidia Hackathon
☆22 · Jun 28, 2022 · Updated 4 years ago
richjjj / duscratch
View on GitHub
Collected code snippets of interest
☆13 · Jun 6, 2023 · Updated 3 years ago
dingyuqing05 / trt2022_wenet
View on GitHub
☆70 · Dec 9, 2022 · Updated 3 years ago
shouxieai / A-series-of-CV
View on GitHub
☆56 · Aug 21, 2023 · Updated 2 years ago
tlc-pack / libflash_attn
View on GitHub
Standalone Flash Attention v2 kernel without libtorch dependency
☆113 · Sep 10, 2024 · Updated last year
l-sf / Nanodet_openvino_quant_deploy
View on GitHub
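This repository deploys the Nanodet detection algorithm under the OpenVINO inference framework and rewrites the pre-processing and post-processing stages for very high performance, speeding up detection on Intel CPU platforms. The model is also quantized (PTQ) to int8 precision with the NNCF and PPQ tools for even faster inference.
☆16 · Jun 14, 2023 · Updated 3 years ago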
shibing624 / lmft
View on GitHub
ChatGLM-6B fine-tuning.
☆135 · Apr 25, 2023 · Updated 3 years ago
yhwang-hub / dl_model_deploy
View on GitHub
☆79 · May 16, 2023 · Updated 3 years ago
sesmfs / onnx_quant_tool
View on GitHub
An ONNX-based quantization tool.
☆71 · Jan 8, 2024 · Updated 2 years ago
shouxieai / hard_decode_trt
View on GitHub
Yolov5 inference on NVDec hardware decoder
☆91 · Nov 6, 2021 · Updated 4 years ago
taishan1994 / Chinese-LLaMA-Alpaca-LoRA-Tuning
View on GitHub
Fine-tuning Chinese-LLaMA-Alpaca with LoRA.
☆35 · May 26, 2023 · Updated 3 years ago
wangzhaode / llm-export
View on GitHub
llm-export can export LLM models to ONNX.
☆355 · May 8, 2026 · Updated 2 months ago
tensorchord / modelz-ChatGLM
View on GitHub
Deploy ChatGLM on Modelz
☆16 · Mar 20, 2023 · Updated 3 years ago
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
View on GitHub
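SAM and LaMa inpainting with a Qt GUI. Supports interactive point and box prompts for SAM with results displayed in real time, followed by inpainting; see the video in the README for the exact steps.
☆54 · Jan 30, 2024 · Updated 2 years ago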
jimmy-evo / opencl_kernels
View on GitHub
An easy way to run, test, benchmark and tune OpenCL kernel files
☆24 · Aug 25, 2023 · Updated 2 years ago
DataXujing / TensorRT-DETR
View on GitHub
NVIDIA-Alibaba 2021 TRT competition `second prize` code submission. Team: 美迪康 AI Lab
☆177 · Jul 29, 2022 · Updated 3 years ago
wangzhaode / mnn-stable-diffusion
View on GitHub
stable diffusion using mnn
☆68 · Sep 28, 2023 · Updated 2 years ago
yongzhuo / chatglm-maths
View on GitHub
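chatglm-6b fine-tuning/LoRA/PPO/inference; the samples are automatically generated integer and decimal addition, subtraction, multiplication, and division problems; runs on GPU or CPU
☆165 · Aug 24, 2023 · Updated 2 years ago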
sesmfs / onnx_matcher
View on GitHub
Using pattern matcher in onnx model to match and replace subgraphs.
☆81 · Feb 7, 2024 · Updated 2 years ago
openvino-dev-samples / decode-infer-on-GPU
View on GitHub
This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces…
☆15 · Jun 15, 2023 · Updated 3 years ago
hpc203 / face-gaze-estimation-opencv-dnn
View on GitHub
Deploys L2CS-Net face orientation estimation with OpenCV, in both C++ and Python versions; runs with only the OpenCV library as a dependency
☆21 · Aug 12, 2023 · Updated 2 years ago
hnsywangxin / controlnet_stable_tensorrt
View on GitHub
stable diffusion, controlnet, tensorrt, accelerate
☆73 · Apr 28, 2023 · Updated 3 years ago