BestAnHongjun / LMDeploy-Jetson
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.
☆80Updated 8 months ago
Related projects
Alternatives and complementary repositories for LMDeploy-Jetson
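- An offline embodied-intelligence guide dog based on the InternLM2 large model☆66Updated 7 months ago
- Personal project repository: applications of large language models and multimodal models☆123Updated 2 weeks ago
- ☢️ TensorRT 2023 second round: Llama inference acceleration and optimization based on TensorRT-LLM☆44Updated last year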
- ☆127Updated 10 months ago
- An ONNX-based quantization tool.☆71Updated 10 months ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed☆58Updated 3 weeks ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM☆40Updated last year
- ☆26Updated 3 weeks ago
- Train a LLaVA model with better Chinese support, and open-source the training code and data.☆38Updated 2 months ago
- This code accompanies the Bilibili video https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7.☆60Updated last year
- ☆19Updated 10 months ago
- mllm-npu: training multimodal large language models on Ascend NPUs☆83Updated 2 months ago
- Accelerate Segment Anything Model inference using TensorRT 8.6.1.6☆82Updated last year
- TensorRT 2022 runner-up solution: accelerating the MobileViT model with TensorRT☆59Updated 2 years ago
- Multimodal MM + Chat collection☆209Updated 2 weeks ago
- ☆59Updated 4 months ago
- llm-export can export LLM models to ONNX.☆231Updated last week
- Hands-on large-model deployment: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated 8 months ago
- ☆20Updated this week
- Run ChatGLM2-6B on BM1684X☆48Updated 8 months ago
- An Android Application for GLCC☆11Updated 2 years ago
- Use a pattern matcher to match and replace subgraphs in ONNX models.☆75Updated 9 months ago
- Llama3 Streaming Chat Sample☆23Updated 7 months ago
- LLM inference service performance testing☆27Updated 11 months ago
- ☆23Updated last year
- Large language model deployment on the Ascend 310 chip☆13Updated 5 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance☆66Updated 4 months ago
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V …☆324Updated this week
- Compare multiple optimization methods on Triton to improve model service performance☆46Updated 10 months ago
- ☆51Updated 8 months ago