dusty-nv / NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
☆284 Updated 8 months ago
Alternatives and similar repositories for NanoLLM
Users who are interested in NanoLLM are comparing it to the libraries listed below.
- A reference application for a local AI assistant with LLM and RAG ☆112 Updated 6 months ago
- ☆110 Updated 2 months ago
- A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson ☆198 Updated last year
- A project that optimizes Whisper for low latency inference using NVIDIA TensorRT ☆86 Updated 8 months ago
- Quick start scripts and tutorial notebooks to get started with TAO Toolkit ☆91 Updated 9 months ago
- A utility library to help integrate Python applications with Metropolis Microservices for Jetson ☆14 Updated 6 months ago
- Collection of reference workflows for building intelligent agents with NIMs ☆161 Updated 5 months ago
- Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A ☆122 Updated last month
- A collection of reference AI microservices and workflows for Jetson Platform Services ☆40 Updated 4 months ago
- This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device. ☆46 Updated 2 months ago
- A reference example for integrating NanoOwl with Metropolis Microservices for Jetson ☆30 Updated last year
- A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT. ☆339 Updated 4 months ago
- ☆98 Updated 9 months ago
- TAO Toolkit deep learning networks with PyTorch backend ☆95 Updated 7 months ago
- Simple and unified interface to zero-shot computer vision models curated for robotics use cases. ☆138 Updated 2 months ago
- High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices ☆141 Updated 3 months ago
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP ☆59 Updated last month
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models ☆120 Updated last month
- Build LLM-powered robots in your garage with MachinaScript For Robots! ☆184 Updated 8 months ago
- Quick exploration into fine-tuning Florence-2 ☆319 Updated 9 months ago
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Tra… ☆525 Updated this week
- Beginner's Guide to reComputer Jetson ☆109 Updated 3 months ago
- llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2 ☆209 Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆264 Updated 8 months ago
- ☆236 Updated 3 months ago
- ☆173 Updated 2 years ago
- Utils for Unsloth ☆99 Updated this week
- A repository for affordable, easy-to-assemble robot arms designed for teleoperation applications. ☆83 Updated 4 months ago
- Control your SO-100 and SO-101 robot and train VLA AI robotics models ☆180 Updated last week
- OpenAI compatible API for TensorRT LLM triton backend ☆209 Updated 10 months ago