liu-mengyang / rust-mmdeploy
Safe MMDeploy Rust wrapper.
☆19Updated last year
Related projects
Alternatives and complementary repositories for rust-mmdeploy
- ☆32Updated 9 months ago
- A tool to convert a TensorRT engine/plan to a fake ONNX model☆37Updated 2 years ago
- Serving Inside Pytorch☆145Updated this week
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆40Updated last year
- A Toolkit to Help Optimize ONNX Models☆81Updated this week
- ☆35Updated 2 weeks ago
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.☆125Updated 2 weeks ago
- OneFlow->ONNX☆42Updated last year
- ☆13Updated 7 months ago
- ☆118Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆15Updated 5 months ago
- Deploy RT-DETR with ONNX from the PaddlePaddle framework and graph cut☆28Updated last year
- Converter from MegEngine to other frameworks☆67Updated last year
- Training LLaMA language model with MMEngine! It supports LoRA fine-tuning!☆40Updated last year
- Standalone Flash Attention v2 kernel without libtorch dependency☆98Updated 2 months ago
- Datasets, Transforms and Models specific to Computer Vision☆83Updated last year
- A Toolkit to Help Optimize Large ONNX Models☆149Updated 6 months ago
- ☆30Updated 2 years ago
- ☢️ TensorRT 2023 second round: accelerating and optimizing Llama model inference with TensorRT-LLM☆44Updated last year
- Models and examples built with OneFlow☆96Updated last month
- OneFlow Serving☆20Updated 9 months ago
- ☆59Updated 4 months ago
- 📒A small curated list of Awesome Diffusion Inference Papers with codes.☆96Updated this week
- ☆23Updated last year
- Simplify large (>2GB) ONNX models☆44Updated 8 months ago
- Paddle Automatically Diff Precision Toolkits.☆47Updated 7 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Competition: third-place solution in the preliminary round☆47Updated last year
- Explore LLM model deployment based on AXera's AI chips☆54Updated last week
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆29Updated 2 months ago
- Tutorials for writing high-performance GPU operators in AI frameworks.☆123Updated last year