dusty-nv / NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
☆335Updated last year
Alternatives and similar repositories for NanoLLM
Users who are interested in NanoLLM are comparing it to the libraries listed below.
- ☆118Updated this week
- A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson☆223Updated 2 years ago
- A reference application for a local AI assistant with LLM and RAG☆117Updated last year
- Quick start scripts and tutorial notebooks to get started with TAO Toolkit☆125Updated last month
- A collection of reference AI microservices and workflows for Jetson Platform Services☆50Updated 10 months ago
- ☆106Updated last month
- A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.☆383Updated 10 months ago
- Collection of reference workflows for building intelligent agents with NIMs☆178Updated 10 months ago
- Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A☆342Updated last month
- A utility library to help integrate Python applications with Metropolis Microservices for Jetson☆15Updated 11 months ago
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP☆63Updated 7 months ago
- TAO Toolkit deep learning networks with PyTorch backend☆106Updated this week
- Vision Transformer (ViT) inference in plain C/C++ with ggml☆300Updated last year
- This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device.☆60Updated 4 months ago
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models☆130Updated 2 months ago
- From scratch implementation of a vision language model in pure PyTorch☆251Updated last year
- A project demonstrating how to make DeepStream docker images.☆88Updated 2 months ago
- High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices☆169Updated 2 months ago
- Simple and unified interface to zero-shot computer vision models curated for robotics use cases.☆162Updated 2 months ago
- Real-time Vision Language Model interaction via webcam - WebRTC-based web interface☆141Updated 2 weeks ago
- The jetson-examples repository by Seeed Studio offers a seamless, one-line command deployment to run vision AI and Generative AI models o…☆234Updated 5 months ago
- An innovative library for efficient LLM inference via low-bit quantization☆350Updated last year
- A toolkit to help optimize ONNX models☆267Updated this week
- ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT☆218Updated last year
- A reference example for integrating NanoOwl with Metropolis Microservices for Jetson☆30Updated last year
- TinyChatEngine: On-Device LLM Inference Library☆931Updated last year
- Advanced quantization toolkit for LLMs and VLMs. Native support for WOQ, MXFP4, NVFP4, GGUF, Adaptive Bits and seamless integration with …☆735Updated this week
- ☆20Updated 8 months ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆102Updated last year
- The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment☆560Updated last week