guoguo1314 / llama3_learn.c

Inference deployment of Llama 3

☆11

Alternatives and similar repositories for llama3_learn.c:

Users who are interested in llama3_learn.c are comparing it to the repositories listed below.

wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆30Updated 3 months ago
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆14Updated 9 months ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆41Updated last year
yvonwin / qwen2.cpp
C++ implementation of qwen2 and llama3
☆37Updated 7 months ago
ozanarmagan / clip_tokenizer_cpp
☆10Updated 6 months ago
TRT2022 / trtllm-llama
☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM
☆45Updated last year
DataXujing / TensorRT-LLM-ChatGLM3
Hands-on large model deployment: TensorRT-LLM, Triton Inference Server, vLLM
☆26Updated 10 months ago
triple-Mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆17Updated 7 months ago
AXERA-TECH / CLIP-ONNX-AX650-CPP
☆21Updated 3 weeks ago
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
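CLIP inference implemented in C++. The model has a few minor modifications; the code for the modifications and for exporting the model can be found in the README. All model files, including the AX650 models, are in Releases. Support for ChineseCLIP has been added.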
☆29Updated 3 weeks ago
luchangli03 / onnxsim_large_model
Simplify large ONNX models (>2GB)
☆51Updated last month
richjjj / cuvid-tensorrt-multi
ffmpeg+cuvid+tensorrt+multicamera
☆12Updated 2 weeks ago
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
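SAM and lama inpaint, with a Qt GUI. Supports interactive SAM by drawing points and boxes with results displayed in real time, followed by inpainting; see the video in the README for detailed usage.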
☆44Updated 11 months ago
wangzyon / trt_learn
TensorRT encapsulation, learn, rewrite, practice.
☆27Updated 2 years ago
EdVince / llm-cpp
☆32Updated 5 months ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆28Updated last week
Tlntin / trt2023
☆23Updated last year
ChunelFeng / CGraph-lite
A lite, header-only, CGraph-API-like DAG project.
☆14Updated 2 months ago
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the Generative AI Model Optimization Competition
☆48Updated last year
caiwanxianhust / FasterLLaMA
A llama model inference framework implemented in CUDA C++
☆43Updated 2 months ago
daquexian / faster-rwkv
☆124Updated last year
yhwang-hub / yolov5_QAT
Quantize yolov5 using pytorch_quantization.🚀🚀🚀
☆13Updated last year
yuxiaoranyu / stable_diffusion_trt_triton
☆19Updated last year
harleyszhang / lite_llama
A light llama-like llm inference framework based on the triton kernel.
☆77Updated last week
zjd1988 / rknn_backend
☆17Updated last year
wangzhaode / mnn-stable-diffusion
stable diffusion using mnn
☆65Updated last year
Zheng-Bicheng / BreezeDeploy
☆11Updated 8 months ago
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones based on MNN-llm: Qwen1.5-0.5B-Chat
☆62Updated 9 months ago
AXERA-TECH / SAM-ONNX-AX650-CPP
☆14Updated last year
triple-Mu / Stable-Diffusion-TensorRT
Stable Diffusion in TensorRT 8.5+
☆14Updated last year