dusty-nv / NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.

☆327

Alternatives and similar repositories for NanoLLM

Users who are interested in NanoLLM are comparing it to the libraries listed below.

NVIDIA-AI-IOT / jetson-generative-ai-playground
☆114 Updated last week
NVIDIA / tao_tutorials
Quick start scripts and tutorial notebooks to get started with TAO Toolkit
☆114 Updated 3 weeks ago
NVIDIA-AI-IOT / jetson-copilot
A reference application for a local AI assistant with LLM and RAG
☆116 Updated 10 months ago
NVIDIA-AI-IOT / jetson-intro-to-distillation
A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson
☆217 Updated last year
NVIDIA / metropolis-nim-workflows
Collection of reference workflows for building intelligent agents with NIMs
☆176 Updated 9 months ago
NVIDIA-AI-IOT / nvidia-tao
☆103 Updated this week
NVIDIA-AI-IOT / jetson-platform-services
A collection of reference AI microservices and workflows for Jetson Platform Services
☆50 Updated 9 months ago
NVIDIA-AI-IOT / mmj_utils
A utility library to help integrate Python applications with Metropolis Microservices for Jetson
☆15 Updated 10 months ago
NVIDIA-AI-IOT / mmj_genai
A reference example for integrating NanoOwl with Metropolis Microservices for Jetson
☆30 Updated last year
asierarranz / Google_Gemma_DevDay
This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device.
☆58 Updated 3 months ago
luxonis / datadreamer
Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models
☆129 Updated last month
NVIDIA-AI-Blueprints / video-search-and-summarization
Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A
☆296 Updated last week
NVIDIA-AI-IOT / deepstream_dockers
A project demonstrating how to make DeepStream docker images.
☆85 Updated 3 weeks ago
dusty-nv / NanoDB
Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP
☆61 Updated 5 months ago
dusty-nv / jetson-voice
ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT
☆217 Updated last year
dusty-nv / jetson-ai-lab
☆19 Updated 6 months ago
NVIDIA / tao_pytorch_backend
TAO Toolkit deep learning networks with PyTorch backend
☆105 Updated this week
qubvel / transformers-notebooks
Inference and fine-tuning examples for vision models from 🤗 Transformers
☆162 Updated 2 months ago
Seeed-Projects / reComputer-Jetson-for-Beginners
Beginner's Guide to reComputer Jetson
☆115 Updated 2 weeks ago
AviSoori1x / seemore
From scratch implementation of a vision language model in pure PyTorch
☆246 Updated last year
NVIDIA / RTX-AI-Toolkit
The NVIDIA RTX™ AI Toolkit is a suite of tools and SDKs for Windows developers to customize, optimize, and deploy AI models across RTX PC…
☆175 Updated 11 months ago
staghado / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆295 Updated last year
shahizat / JetsonGPT
Using FastChat-T5 Large Language Model, Vosk API for automatic speech recognition, and Piper for text-to-speech
☆126 Updated 2 years ago
intel / neural-speed
An innovative library for efficient LLM inference via low-bit quantization
☆349 Updated last year
hailo-ai / tappas
High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices
☆162 Updated 3 weeks ago
mit-han-lab / TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
☆906 Updated last year
NVIDIA / workbench-example-nemotron-finetune
An NVIDIA AI Workbench example project for fine-tuning a Nemotron-3 8B model
☆54 Updated last year
autodistill / autodistill-florence-2
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
☆67 Updated last year
intel / auto-round
Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU.
☆679 Updated this week
hailo-ai / hailort
An open source light-weight and high performance inference framework for Hailo devices
☆137 Updated 3 weeks ago