viitrix / vt-transformer

Transformer framework for edge computing based on C++.

☆128

Alternatives and similar repositories for vt-transformer

Users who are interested in vt-transformer are comparing it to the libraries listed below.

01-ai / Descartes
☆112 Updated last year
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆50 Updated last year
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆48 Updated last year
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆266 Updated 2 months ago
QwenLM / qwen.cpp
C++ implementation of Qwen-LM
☆606 Updated 10 months ago
modelbox-ai / modelbox
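A high-performance, highly extensible, easy-to-use framework for AI applications. Provides AI application developers with a unified, high-performance, easy-to-use programming framework for quickly building device-edge-cloud AI industry applications on top of full-stack AI services; supports GPU, …
☆156 Updated last year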
FittenTech / OpenLLaMA-Chinese
OpenLLaMA-Chinese, permissively licensed open-source instruction-following models based on OpenLLaMA
☆66 Updated 2 years ago
Tlntin / qwen-ascend-llm
☆50 Updated 11 months ago
chenyangMl / llama2.c-zh
llama2.c-zh, a small language model that supports Chinese-language scenarios
☆150 Updated last year
torchpipe / torchpipe
Serving Inside Pytorch
☆163 Updated 3 weeks ago
Tlntin / ChatGLM2-6B-TensorRT
☆90 Updated 2 years ago
wangzhaode / llm-export
llm-export can export LLM models to ONNX.
☆314 Updated last month
hpcaitech / SwiftInfer
Efficient AI Inference & Serving
☆478 Updated last year
tc-mb / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆102 Updated this week
MooreThreads / vllm_musa
A high-throughput and memory-efficient inference and serving engine for LLMs
☆64 Updated 11 months ago
infinigence / Infini-Megrez
☆337 Updated last week
1694439208 / GOT-OCR-Inference
Exploring deployment acceleration for the GOT-OCR project, not limited to any one language
☆62 Updated 11 months ago
ouwei2013 / baichuan13b.cpp
ggml implementation of the baichuan13b model (adapted from llama.cpp)
☆55 Updated 2 years ago
zai-org / GLM-Edge
GLM Series Edge Models
☆149 Updated 4 months ago
zhaohb / fastapi_tritonserver
☆27 Updated 11 months ago
ling0322 / libllm
Efficient inference of large language models.
☆149 Updated 3 weeks ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36 Updated 9 months ago
360CVGroup / SEEChat
Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM
☆101 Updated last year
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat
☆85 Updated last year
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆117 Updated last week
infinigence / Infini-Megrez-Omni
☆240 Updated 7 months ago
OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…
☆86 Updated last year
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆45 Updated last year
wangzhaode / mnn-stable-diffusion
Stable Diffusion using MNN
☆67 Updated 2 years ago
OpenBMB / MobileCPM
A Toolkit for Running On-device Large Language Models (LLMs) in APP
☆78 Updated last year