cavedweller509/LMDeploy-Jetson

Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.

☆106

Alternatives and similar repositories for LMDeploy-Jetson

Users that are interested in LMDeploy-Jetson are comparing it to the libraries listed below.

ModelTC / msbench
A tool for model sparsification based on torch.fx
☆13 • Jun 3, 2024 • Updated 2 years ago
gesanqiu / Ericsson-Yolov3-SNPE
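Parallel-processing approaches and optimizations (at the application level) for cases where an algorithm is too slow to process a real-time video stream.
☆11 • Jun 5, 2021 • Updated 5 years ago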
triple-mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18 • May 22, 2024 • Updated 2 years ago
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆163 • Updated this week
OpenPPL / ppl.llm.kernel.cuda
☆150 • Jan 9, 2025 • Updated last year
dusty-nv / NanoDB
Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP
☆65 • May 6, 2025 • Updated last year
NVIDIA-AI-IOT / jetson-copilot
A reference application for a local AI assistant with LLM and RAG
☆127 • Dec 5, 2024 • Updated last year
richjjj / duscratch
Collected code snippets of interest
☆13 • Jun 6, 2023 • Updated 3 years ago
OpenGVLab / Official-ConvMAE-Det
☆18 • Aug 23, 2022 • Updated 3 years ago
taishan1994 / MiniClip
Train a simple CLIP model hands-on to deepen your understanding of CLIP.
☆27 • May 20, 2025 • Updated last year
sesmfs / onnx_quant_tool
An ONNX-based quantization tool.
☆71 • Jan 8, 2024 • Updated 2 years ago
yhwang-hub / dl_model_infer
🚀🚀🚀 This is a high-performance AI inference C++ library. It currently supports the deployment of yolov5, yolov7, yolov7-pose, yolov8, yol…
☆137 • May 4, 2024 • Updated 2 years ago
ChuanyangZheng / L2ViT
Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer
☆15 • Sep 7, 2024 • Updated last year
BaofengZan / yolov5_2.0-TensorRt
TensorRT acceleration for the Ultralytics version of yolov5 2.0
☆37 • Aug 3, 2020 • Updated 5 years ago
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆662 • Sep 11, 2024 • Updated last year
bytedance / AffineQuant
Official implementation of the ICLR 2024 paper AffineQuant
☆30 • Mar 30, 2024 • Updated 2 years ago
casper-hansen / AutoAWQ_kernels
☆80 • Nov 26, 2024 • Updated last year
LeiWang1999 / tvm_gpu_gemm
play gemm with tvm
☆91 • Jul 22, 2023 • Updated 3 years ago
GATECH-EIC / HW-NAS-Bench
[ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark
☆118 • Apr 18, 2023 • Updated 3 years ago
levipereira / triton-server-yolo
This repository serves as an example of deploying YOLO models on Triton Server for performance and testing purposes
☆71 • Oct 20, 2025 • Updated 9 months ago
WeijieMax / LSSInst
[3DV 2025] LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation
☆21 • Sep 25, 2025 • Updated 10 months ago
NVIDIA / TensorRT-Edge-LLM
High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI
☆483 • Updated this week
morsoli / llmbenchmark
Comparison of LLM API performance metrics - in-depth analysis of key metrics such as TTFT and TPS
☆20 • Sep 12, 2024 • Updated last year
tensorchord / llmspec
OpenAI compatible API for open source LLMs
☆16 • Oct 30, 2023 • Updated 2 years ago
xzyun2011 / wulewule
An AI assistant for Black Myth: Wukong, built on InternLM, for learning more about the stories behind the game (video being updated)
☆36 • Jan 4, 2025 • Updated last year
jundaf2 / INT8-Flash-Attention-FMHA-Quantization
☆165 • Sep 15, 2023 • Updated 2 years ago
anseeto / jetson-gpu-burn
Multi-GPU CUDA stress test
☆30 • Nov 23, 2023 • Updated 2 years ago
pointpillars-on-openvino / pointpillars-on-openvino
☆12 • Dec 16, 2021 • Updated 4 years ago
bentoml / BentoLMDeploy
Self-host LLMs with LMDeploy and BentoML
☆22 • Jul 14, 2026 • Updated last week
OpenPPL / ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
☆1,807 • Mar 28, 2024 • Updated 2 years ago
wangzyon / pyInfer
Async inference for machine learning models
☆26 • Sep 21, 2022 • Updated 3 years ago
dusty-nv / NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector…
☆379 • Oct 18, 2024 • Updated last year
OAID / TengineInferPipe
☆23 • Dec 8, 2022 • Updated 3 years ago
leimao / Nsight-Compute-Docker-Image
Nsight Compute In Docker
☆13 • Dec 21, 2023 • Updated 2 years ago
agno-agi / ai-app
☆12 • May 23, 2024 • Updated 2 years ago
BaofengZan / my_trt_pro
Learning all sorts of things by following Tensorrt_pro
☆39 • Nov 25, 2022 • Updated 3 years ago
LeiWang1999 / Stream-k.tvm
☆20 • Sep 28, 2024 • Updated last year
jukindle / aslaug3d_simulation
OpenAI Gym environment for RoyalPanda, a Clearpath Ridgeback base with a Franka Emika manipulator.
☆13 • Jul 13, 2020 • Updated 6 years ago
hopef / llama3_chat
Llama3 Streaming Chat Sample
☆22 • Apr 24, 2024 • Updated 2 years ago