yinghuo302 / ascend-llm

Large language model deployment on the Ascend 310 chip

☆24

Alternatives and similar repositories for ascend-llm

Users who are interested in ascend-llm are comparing it to the repositories listed below

BestAnHongjun / LMDeploy-Jetson
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…
☆103 Updated last year
Tlntin / qwen-ascend-llm
☆52 Updated last year
thb1314 / mmyolo_tensorrt
☆150 Updated last year
wangzhaode / llm-export
llm-export can export LLM models to ONNX.
☆330 Updated last month
mindspore-lab / mindyolo
A toolbox of yolo models and algorithms based on MindSpore
☆163 Updated last week
shouxieai / tensorRT_quantization
This code accompanies the Bilibili video https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7.
☆70 Updated 2 years ago
sesmfs / onnx_quant_tool
An ONNX-based quantization tool.
☆71 Updated last year
sophgo / LLM-TPU
Run generative AI models on Sophgo BM1684X/BM1688
☆253 Updated last week
sophgo / sophon-demo
☆439 Updated last week
Ascend / samples
☆145 Updated 2 years ago
bug-developer021 / YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model serving performance
☆52 Updated last year
yjh0410 / YOLO-Tutorial-v2
☆69 Updated 7 months ago
Susan19900316 / yolov5_tensorrt_int8
A summary of INT8 quantization methods for YOLOv5 with TensorRT
☆83 Updated last year
shouxieai / learning-cuda-trt
learning-cuda-trt
☆118 Updated 2 years ago
kaylorchen / ai_framework_demo
Test demos for ai_framework
☆42 Updated last month
Oneflow-Inc / one-yolov5
A more efficient YOLOv5 with the OneFlow backend 🎉🎉🎉
☆216 Updated 4 months ago
PaddlePaddle / PaddleCustomDevice
PaddlePaddle (飞桨) custom device implementation.
☆100 Updated last week
huangzongmou / yolov8-pytorch_quantization
Quantizing YOLOv8 with pytorch_quantization
☆118 Updated 2 years ago
sesmfs / onnx_matcher
Uses a pattern matcher on ONNX models to match and replace subgraphs.
☆81 Updated last year
CYYAI / AiInfer
☆114 Updated last year
455670288 / rknn-yolov8s-multi-thread-inference
YOLOv8s inference deployment on the RK3588, accelerated by parallel NPU inference with a multi-threaded pool
☆50 Updated last year
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat
☆86 Updated last year
ShaohonChen / Qwen3-SmVL
Joins the SmolVLM2 vision head to the Qwen3-0.6B model and fine-tunes the combined model
☆445 Updated 2 months ago
sophon-ai-algo / examples
Examples for SophonSDK
☆107 Updated 3 years ago
pandada8 / llm-inference-benchmark
LLM inference service performance benchmark
☆44 Updated last year
sophgo / sophon-pipeline
☆42 Updated last year
yhwang-hub / dl_model_infer
🚀🚀🚀 This is a high-performance AI inference C++ library. It currently supports the deployment of yolov5, yolov7, yolov7-pose, yolov8, yol…
☆136 Updated last year
sophgo / sophon-stream
☆126 Updated 3 months ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 semifinal topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆43 Updated 2 years ago
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆50 Updated last year