BestAnHongjun / LMDeploy-Jetson
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.
☆90Updated 10 months ago
Alternatives and similar repositories for LMDeploy-Jetson:
Users who are interested in LMDeploy-Jetson are comparing it to the libraries listed below:
- An offline embodied-intelligence guide dog based on the InternLM2 large model☆83Updated 10 months ago
- ☆130Updated last year
- Large language model deployment on the Ascend 310 chip☆18Updated 8 months ago
- An ONNX-based quantization tool.☆71Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs☆90Updated 5 months ago
- A llama model inference framework implemented in CUDA C++☆45Updated 3 months ago
- ☆39Updated 3 months ago
- ☆59Updated 7 months ago
- This code accompanies the Bilibili video https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7 .☆64Updated last year
- Llama3 Streaming Chat Sample☆22Updated 9 months ago
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM☆44Updated last year
- ☆20Updated last year
- DeepSpeed tutorials & annotated examples & study notes (efficient training of large models)☆150Updated last year
- A lightweight Llama-like LLM inference framework based on Triton kernels.☆91Updated this week
- ☆37Updated 4 months ago
- Tools for theoretical LLM performance analysis, supporting parameter, FLOPs, memory, and latency analysis.☆78Updated last month
- Train a LLaVA model with better Chinese support, with open-source training code and data.☆45Updated 5 months ago
- A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space☆63Updated last month
- Official PyTorch implementation of FlatQuant: Flatness Matters for LLM Quantization☆102Updated 3 weeks ago
- A repository for recording my TensorRT learning process.☆10Updated 7 months ago
- Async inference for machine learning models☆26Updated 2 years ago
- Hands-on large model deployment: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated 11 months ago
- llm-export can export LLM models to ONNX.☆263Updated last month
- ☆62Updated last week
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM☆41Updated last year
- Inference code for LLaMA models☆113Updated last year
- TensorRT 2022 runner-up solution: accelerating the MobileViT model with TensorRT☆61Updated 2 years ago
- ☆23Updated last year
- Uses a pattern matcher on ONNX models to match and replace subgraphs.☆77Updated last year
- Run ChatGLM2-6B on BM1684X☆49Updated 11 months ago