dingyuqing05 / trt2022_wenetLinks

☆71

Alternatives and similar repositories for trt2022_wenet

Users that are interested in trt2022_wenet are comparing it to the libraries listed below

Sorting:

huismiling / wenet_trt8
☆75Updated 3 years ago
torchpipe / torchpipe
Serving Inside Pytorch
☆165Updated 2 weeks ago
BBuf / onnx_learn
☆102Updated 4 years ago
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145Updated 3 years ago
Tlntin / trt2023
☆26Updated 2 years ago
luchangli03 / onnxsim_large_model
simplify >2GB large onnx model
☆69Updated last year
DeepVAC / libdeepvac
Use PyTorch model in C++ project
☆140Updated 4 years ago
luchangli03 / export_llama_to_onnx
export llama to onnx
☆137Updated 11 months ago
xiatwhu / trt2023
☆27Updated 2 years ago
TRT2022 / MST-plus-plus-TensorRT
TensorRT 2022复赛方案：首个基于Transformer的图像重建模型MST++的TensorRT模型推断优化
☆143Updated 3 years ago
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆162Updated last month
Eddie-Wang1120 / Eddie-Wang-Hackathon2023
Whisper inference with TensorRT-LLM
☆24Updated 2 years ago
Deep-Spark / DeepSparkInference
DeepSparkInference has selected 216 inference models of both small and large sizes. The small models cover fields such as computer vision…
☆26Updated this week
TRT2022 / trtllm-llama
☢️ TensorRT 2023复赛——基于TensorRT-LLM的Llama模型推断加速优化
☆50Updated 2 years ago
tpoisonooo / chgemm
symmetric int8 gemm
☆67Updated 5 years ago
bug-developer021 / YOLOV5_optimization_on_triton
Compare multiple optimization methods on triton to imporve model service performance
☆52Updated last year
zzk0 / triton
Triton Inferece Server Model Config and Client Scripts
☆32Updated 3 years ago
Tlntin / ChatGLM2-6B-TensorRT
☆90Updated 2 years ago
OpenPPL / ppl.nn.llm
☆140Updated last year
BBuf / onnx2X
ONNX2Pytorch
☆165Updated 4 years ago
wangzhaode / llm-export
llm-export can export llm model to onnx.
☆333Updated last month
Oldpan / DeployIsAllYouNeed
☆120Updated 2 years ago
Oneflow-Inc / models
Models and examples built with OneFlow
☆100Updated last year
gesanqiu / Chinese_MobileBert_on_SNPE
Run Chinese MobileBert model on SNPE.
☆15Updated 2 years ago
BBuf / model-compression
model compression based on pytorch (1、quantization: 8/4/2bits(dorefa)、ternary/binary value(twn/bnn/xnor-net)；2、 pruning: normal、regular a…
☆171Updated 5 years ago
lucasjinreal / AI-Infer-Engine-From-Zero
关于自建AI推理引擎的手册，从0开始你需要知道的所有事情
☆271Updated 3 years ago
leimao / ONNX-Python-Examples
ONNX Python Examples
☆16Updated 3 years ago
void-main / FasterTransformer
Transformer related optimization, including BERT, GPT
☆59Updated 2 years ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023复赛选题：通义千问Qwen-7B用TensorRT-LLM模型搭建及优化
☆43Updated 2 years ago
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17Updated 2 years ago