yinghuo302 / ascend-llm
Large language model deployment based on the Ascend 310 chip
☆18Updated 9 months ago
Alternatives and similar repositories for ascend-llm:
Users who are interested in ascend-llm are comparing it to the libraries listed below
- ☆39Updated 5 months ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆89Updated last year
- Large Language Model Onnx Inference Framework☆32Updated 2 months ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆41Updated last year
- ☆11Updated last month
- Examples for SophonSDK☆106Updated 2 years ago
- Thoroughly understand BP backpropagation: 15 lines of code, a simple C++ implementation too, 98.29% accuracy on MNIST classification☆34Updated 2 years ago
- Triton Documentation in Simplified Chinese☆62Updated 2 months ago
- MoE model with ONNX Runtime☆34Updated 10 months ago
- ☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM☆46Updated last year
- Run ChatGLM2-6B on BM1684X☆49Updated last year
- ☆35Updated this week
- Efficient deployment: TensorRT inference for YOLOX, V3, V4, V5, V6, V7, V8, and EdgeYOLO ™️, with pre- and post-processing both implemented as CUDA kernels. CPP/CUDA🚀☆49Updated 2 years ago
- Easy training of official YOLOv8, YOLOv7, YOLOv6, and YOLOv5 models, and pruning of all models using Torch-Pruning!☆59Updated last year
- Explore LLM model deployment based on AXera's AI chips☆87Updated last week
- ☆83Updated 2 weeks ago
- ☆22Updated last year
- ☆40Updated 8 months ago
- ☆132Updated last year
- Deployment of a yolov5 model (.pt) on the RK3588(S), with real-time camera detection☆50Updated last year
- ☆322Updated this week
- Inference code for LLaMA models☆118Updated last year
- A toolbox of yolo models and algorithms based on MindSpore☆121Updated last week
- Simplify large (>2GB) ONNX models☆54Updated 4 months ago
- Run generative AI models on Sophgo BM1684X☆190Updated this week
- This project showcases the deployment of the RT-DETR model using ONNXRUNTIME in C++ and Python.☆52Updated last year
- Async inference for machine learning models☆26Updated 2 years ago
- Large model deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated last year
- An ONNX-based quantization tool.☆71Updated last year
- This project implements YOLOv11 inference on the RK3588 platform using the RKNN framework. With deep optimization of the official code an…☆27Updated 3 months ago