BestAnHongjun / LMDeploy-Jetson
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.
☆70Updated 5 months ago
Related projects:
- ☆121Updated 8 months ago
- An ONNX-based quantization tool.☆69Updated 8 months ago
- LLM deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM☆25Updated 6 months ago
- Accelerate Segment Anything Model inference using TensorRT 8.6.1.6☆78Updated 11 months ago
- llm-export can export LLM models to ONNX.☆193Updated this week
- Collection of image and video datasets for generative AI and multimodal visual AI☆17Updated 4 months ago
- ☆116Updated last year
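- Projects and tutorials for MiniCPM and MiniCPM-V, covering six topics: inference, quantization, edge deployment, fine-tuning, technical reports, and applications.☆87Updated last week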
- Llama3 Streaming Chat Sample☆23Updated 4 months ago
- A toolbox of YOLO models and algorithms based on MindSpore☆91Updated last week
- Run ChatGLM2-6B on BM1684X☆48Updated 6 months ago
- This code accompanies the Bilibili video (https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7).☆59Updated 11 months ago
- ☢️ TensorRT 2023 second round: accelerating and optimizing Llama model inference with TensorRT-LLM☆40Updated 11 months ago
- Explore LLM model deployment based on AXera's AI chips☆48Updated 2 weeks ago
- https://start.oneflow.org/oneflow-yolo-doc☆22Updated last year
- A simple tool that can generate TensorRT plugin code quickly.☆216Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs☆77Updated 3 weeks ago
- ☆108Updated 6 months ago
- LLM101n: Let's build a Storyteller (Chinese edition)☆113Updated last month
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM☆39Updated 11 months ago
- README.md☆42Updated last year
- Multimodal MM + Chat collection☆187Updated 2 weeks ago
- ☆23Updated last year
- This project showcases the deployment of the RT-DETR model using ONNXRUNTIME in C++ and Python.☆41Updated last year
- ☆50Updated last year
- Compare multiple optimization methods on Triton to improve model serving performance☆46Updated 8 months ago
- A unified evaluation library for multiple machine learning libraries☆251Updated 5 months ago
- For 2022 Nvidia Hackathon☆17Updated 2 years ago
- Using pattern matcher in onnx model to match and replace subgraphs.☆73Updated 7 months ago
- ☆46Updated 6 months ago