luchangli03 / onnxsim_large_model
Simplify large ONNX models (>2 GB)
☆44 Updated 8 months ago
Related projects
Alternatives and complementary repositories for onnxsim_large_model
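- Export Llama to ONNX ☆96 Updated 5 months ago
- ☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM ☆44 Updated last year
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM ☆40 Updated last year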
- ☆123 Updated 11 months ago
- A toolkit to help optimize large ONNX models ☆148 Updated 6 months ago
- Run ChatGLM2-6B on BM1684X ☆48 Updated 8 months ago
- ☆57 Updated 2 weeks ago
- ☆140 Updated 6 months ago
- llm-export can export LLM models to ONNX. ☆230 Updated last week
- Decoding Attention is specially optimized for multi-head attention (MHA) using CUDA cores for the decoding stage of LLM inference. ☆23 Updated 2 weeks ago
- ☆123 Updated 2 weeks ago
- An easy-to-use package for implementing SmoothQuant for LLMs ☆83 Updated 6 months ago
- ☆23 Updated last year
- ☆70 Updated last year
- Large Language Model ONNX Inference Framework ☆25 Updated last month
- Transformer related optimization, including BERT, GPT ☆17 Updated last year
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios. ☆29 Updated 2 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆20 Updated 8 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Contest: third-place solution in the preliminary round ☆47 Updated last year
- ☆90 Updated last year
- Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat ☆48 Updated 7 months ago
- Stable Diffusion using MNN ☆63 Updated last year
- ☆99 Updated 8 months ago
- ☆138 Updated 2 weeks ago
- ☆26 Updated last year
- A quantization algorithm for LLMs ☆101 Updated 5 months ago
- Hands-on large model deployment: TensorRT-LLM, Triton Inference Server, vLLM ☆26 Updated 8 months ago
- LLM deployment project based on ONNX. ☆26 Updated last month
- A general 2-8 bit quantization toolbox with GPTQ/AWQ/HQQ that exports easily to ONNX/ONNX Runtime. ☆149 Updated last month
- Transformer related optimization, including BERT, GPT ☆60 Updated last year