sophgo / LLM-TPU

Run generative AI models on Sophgo BM1684X/BM1688

☆230

Alternatives and similar repositories for LLM-TPU

Users that are interested in LLM-TPU are comparing it to the libraries listed below

wangzhaode / llm-export
llm-export can export LLM models to ONNX.
☆302 Updated 6 months ago
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆109 Updated 2 weeks ago
sophgo / sophon-demo
☆400 Updated last week
AXERA-TECH / ax-samples
Sample code for world-class artificial intelligence SoCs for computer vision applications.
☆261 Updated 4 months ago
sophgo / sophon-pipeline
☆42 Updated last year
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆49 Updated last year
sophgo / tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
☆765 Updated last week
sophgo / sophon-stream
☆120 Updated last month
sophon-ai-algo / examples
Examples for SophonSDK
☆105 Updated 2 years ago
modelbox-ai / modelbox
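A high-performance, highly extensible, easy-to-use framework for AI applications. Provides AI application developers with a unified, high-performance, easy-to-use programming framework for quickly building device-edge-cloud AI industry applications on top of full-stack AI services; supports GPU, …
☆156 Updated last year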
AXERA-TECH / ax-pipeline
The pipeline example based on AXera-Pi (AX620A), AXera-Pi Pro (AX650N) and AXera-Pi Zero (AX620Q) shows the software development skill…
☆5 Updated last week
tpoisonooo / llama.onnx
LLaMA/RWKV ONNX models, quantization and test cases
☆362 Updated 2 years ago
PaddlePaddle / PaddleCustomDevice
PaddlePaddle custom device implementation (custom hardware integration for PaddlePaddle, 『飞桨』).
☆89 Updated this week
airockchip / rknn-llm
☆890 Updated 3 weeks ago
HuPengsheet / use-ncnn
Learning the NCNN code, with various small demos.
☆115 Updated last year
MooreThreads / vllm_musa
A high-throughput and memory-efficient inference and serving engine for LLMs
☆54 Updated 9 months ago
luchangli03 / export_llama_to_onnx
Export LLaMA to ONNX
☆130 Updated 7 months ago
rockchip-linux / rknpu2
☆751 Updated last year
Tlntin / qwen-ascend-llm
☆48 Updated 9 months ago
luchangli03 / onnxsim_large_model
Simplify large ONNX models (>2GB)
☆61 Updated 8 months ago
pnnx / pnnx
PyTorch Neural Network eXchange
☆605 Updated this week
VeriSilicon / TIM-VX
VeriSilicon Tensor Interface Module
☆236 Updated 6 months ago
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large ONNX Models
☆157 Updated last year
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat
☆81 Updated last year
Tlntin / trt2023
☆26 Updated last year
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆42 Updated 9 months ago
OpenPPL / ppl.pmx
☆59 Updated 8 months ago
ModelTC / LightCompress
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…
☆522 Updated this week
Arm-China / Model_zoo
Zhouyi model zoo
☆101 Updated last month
AIFlowPlayer / LMDeploy-Jetson
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…
☆98 Updated last year