yinghuo302 / ascend-llm
Large language model deployment based on the Ascend 310 chip
☆17Updated 10 months ago
Alternatives and similar repositories for ascend-llm:
Users who are interested in ascend-llm are comparing it to the libraries listed below
- ☆41Updated 5 months ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆90Updated last year
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆42Updated last year
- ☆11Updated 2 months ago
- MoE model with onnx runtime☆37Updated 11 months ago
- Run ChatGLM2-6B on BM1684X☆49Updated last year
- ☆40Updated 9 months ago
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM☆46Updated last year
- Run generative AI models on Sophgo BM1684X☆199Updated this week
- Explore LLM model deployment based on AXera's AI chips☆100Updated this week
- ☆336Updated this week
- This project showcases the deployment of the RT-DETR model using ONNXRUNTIME in C++ and Python.☆52Updated last year
- An ONNX-based quantization tool.☆71Updated last year
- Triton documentation in Simplified Chinese☆67Updated last week
- LLM deployment project based on ONNX.☆36Updated 6 months ago
- Large model deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated last year
- llm-export can export LLM models to ONNX.☆282Updated 3 months ago
- Easy training of the official YOLOv8, YOLOv7, YOLOv6, and YOLOv5 models, and pruning of all models using Torch-Pruning!☆62Updated last year
- ☆32Updated last year
- ☆105Updated 3 weeks ago
- ☆24Updated last year
- This code accompanies the Bilibili video https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7.☆67Updated last year
- ☆133Updated last year
- Compare multiple optimization methods on Triton to improve model service performance☆50Updated last year
- Examples for SophonSDK☆105Updated 2 years ago
- ☆19Updated last year
- Large Language Model Onnx Inference Framework☆32Updated 3 months ago
- Efficient deployment: TRT inference for YOLO X, V3, V4, V5, V6, V7, V8, and EdgeYOLO ™️, with pre- and post-processing both implemented as CUDA kernels. CPP/CUDA🚀☆49Updated 2 years ago
- Deploying the deepseek-r1-1.5b distilled model on the RK3588 platform using the rkllmrt API☆12Updated 2 months ago
- TensorRT 2022 runner-up solution: accelerating the MobileViT model with TensorRT☆65Updated 2 years ago