sophgo / LLM-TPU
Run generative AI models on the Sophgo BM1684X
☆152Updated this week
Alternatives and similar repositories for LLM-TPU:
Users that are interested in LLM-TPU are comparing it to the libraries listed below
- llm-export exports LLM models to ONNX.☆255Updated last week
- run ChatGLM2-6B in BM1684X☆49Updated 10 months ago
- Sample code for world-class artificial intelligence SoCs for computer vision applications.☆233Updated 2 months ago
- Explore LLM model deployment based on AXera's AI chips☆69Updated 2 weeks ago
- export llama to onnx☆111Updated 3 weeks ago
- LLaMa/RWKV onnx models, quantization and testcase☆356Updated last year
- simplify >2GB large onnx model☆51Updated last month
- ☆37Updated 6 months ago
- Machine learning compiler based on MLIR for Sophgo TPU.☆648Updated 2 weeks ago
- ☆284Updated this week
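- A high-performance, highly extensible, easy-to-use framework for AI applications. Provides AI application developers with a unified, high-performance, easy-to-use programming framework for quickly building AI industry applications across device, edge, and cloud on top of full-stack AI services; supports GPU,…☆143Updated 7 months ago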
- ☆87Updated this week
- DDK for Rockchip NPU☆62Updated 4 years ago
- Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat☆62Updated 9 months ago
- PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle, 飞桨)☆77Updated this week
- A Toolkit to Help Optimize Large Onnx Model☆151Updated 8 months ago
- stable diffusion using mnn☆65Updated last year
- LLM deployment project based on ONNX.☆30Updated 3 months ago
- ☆27Updated 2 months ago
- ☆57Updated last month
- ☆127Updated 3 weeks ago
- Examples for SophonSDK☆105Updated 2 years ago
- ☆501Updated last month
- Compare multiple optimization methods on Triton to improve model service performance☆48Updated last year
- run chatglm3-6b in BM1684X☆38Updated 10 months ago
- ☆140Updated 8 months ago
- PyTorch Neural Network eXchange☆551Updated 3 weeks ago
- ☆124Updated last year
- A Toolkit to Help Optimize Onnx Model☆101Updated this week