NVIDIA-AI-Blueprints / video-search-and-summarization
Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A
☆97 Updated last month
Alternatives and similar repositories for video-search-and-summarization
Users who are interested in video-search-and-summarization are comparing it to the repositories listed below.
- Collection of reference workflows for building intelligent agents with NIMs ☆158 Updated 4 months ago
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆147 Updated last month
- Quick start scripts and tutorial notebooks to get started with TAO Toolkit ☆85 Updated 9 months ago
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline. ☆105 Updated last week
- Multimodal AI agent with Llama 3.2: A Streamlit app that processes text, images, PDFs, and PPTs, integrating NIM microservices, Milvus, a… ☆116 Updated 8 months ago
- ☆156 Updated this week
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆70 Updated last week
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆271 Updated 7 months ago
- A reference example for integrating NanoOwl with Metropolis Microservices for Jetson ☆30 Updated 11 months ago
- ☆96 Updated 8 months ago
- ☆113 Updated 6 months ago
- Fine-tune Gemma 3 on an object detection task ☆43 Updated this week
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models ☆118 Updated 2 weeks ago
- Ultralytics Notebooks ☆80 Updated last week
- Take your LLM to the optometrist. ☆26 Updated last week
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆33 Updated 5 months ago
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆165 Updated last month
- Implementation of End-to-End YOLO Models for DeepStream ☆51 Updated 7 months ago
- A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson ☆196 Updated last year
- This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device. ☆45 Updated last month
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 Updated last year
- This repository provides an optical character detection and recognition solution optimized for NVIDIA devices. ☆75 Updated 3 weeks ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆114 Updated 8 months ago
- Route LLM requests to the best model for the task at hand. ☆56 Updated this week
- Notebooks for fine-tuning PaliGemma ☆107 Updated last month
- A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT. ☆337 Updated 4 months ago
- From-scratch implementation of a vision language model in pure PyTorch ☆220 Updated last year
- A utility library to help integrate Python applications with Metropolis Microservices for Jetson ☆13 Updated 5 months ago
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆64 Updated 9 months ago
- TAO Toolkit deep learning networks with PyTorch backend ☆95 Updated 6 months ago