yvonwin / qwen2.cpp

C++ implementation of Qwen2 and Llama 3

☆49

Alternatives and similar repositories for qwen2.cpp

Users who are interested in qwen2.cpp are comparing it to the libraries listed below.

daquexian / faster-rwkv
☆125 Updated 2 years ago
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆49 Updated last year
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆43 Updated 2 years ago
TRT2022 / trtllm-llama
☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM
☆51 Updated 2 years ago
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆49 Updated last year
EdVince / whisper-trtllm
Whisper in TensorRT-LLM
☆17 Updated 2 years ago
wangzhaode / mnn-stable-diffusion
stable diffusion using mnn
☆67 Updated 2 years ago
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆18 Updated last year
guoguo1314 / llama3_learn.c
Inference deployment of Llama 3
☆11 Updated last year
EdVince / llm-cpp
☆33 Updated last year
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones based on MNN-llm: Qwen1.5-0.5B-Chat
☆89 Updated last year
Tlntin / ChatGLM2-6B-TensorRT
☆90 Updated 2 years ago
luchangli03 / onnxsim_large_model
Simplify large ONNX models (>2GB)
☆70 Updated last year
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization contest
☆50 Updated 2 years ago
MollySophia / rwkv-qualcomm
Run inference for RWKV v5, v6 and v7 with the Qualcomm AI Engine Direct SDK
☆90 Updated this week
lrw04 / tinyllamas-ncnn
Run inference for TinyLlama models on ncnn
☆24 Updated 2 years ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36 Updated 2 months ago
wangkuiyi / huggingface-tokenizer-in-cxx
☆70 Updated 2 years ago
BaofengZan / GOT-OCRv2-onnx
For learning GOT/Qwen/OnnxLLm
☆53 Updated last year
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆163 Updated 3 months ago
Peter-Chou / transformer_cpp_tokenizers
transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)
☆18 Updated 3 years ago
DataXujing / TensorRT-LLM-ChatGLM3
Large model deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM
☆27 Updated last year
luchangli03 / export_llama_to_onnx
export llama to onnx
☆137 Updated last year
wjc852456 / ONNX-TensorRT-LibTorch
Deploy ONNX models with TensorRT and LibTorch
☆19 Updated 4 years ago
MollySophia / rwkv-mobile
RWKV inference with multiple supported backends.
☆77 Updated this week
FeiGeChuanShu / segment-anything-ncnn
An example of segment-anything inference with ncnn
☆123 Updated 2 years ago
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17 Updated 2 years ago
wangzhaode / tokenizer.cpp
A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.
☆21 Updated last month
Tlntin / qwen-ascend-llm
☆55 Updated last year
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆139 Updated this week