NVIDIA-AI-Blueprints / video-search-and-summarization
Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A
☆184 Updated last month
Alternatives and similar repositories for video-search-and-summarization
Users who are interested in video-search-and-summarization are comparing it to the repositories listed below
- Collection of reference workflows for building intelligent agents with NIMs ☆167 Updated 6 months ago
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆158 Updated 3 months ago
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline. ☆212 Updated 2 weeks ago
- ☆160 Updated last week
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆262 Updated 3 weeks ago
- Multimodal AI agent with Llama 3.2: A Streamlit app that processes text, images, PDFs, and PPTs, integrating NIM microservices, Milvus, a… ☆122 Updated 10 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆304 Updated 9 months ago
- Customizable, AI-driven virtual assistant designed to streamline customer service operations, handle common inquiries, and improve overal… ☆164 Updated 3 weeks ago
- ☆36 Updated this week
- ☆113 Updated 8 months ago
- ☆290 Updated this week
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆84 Updated last week
- Ultralytics Notebooks 🚀 ☆97 Updated last week
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆34 Updated 7 months ago
- Fine tune Gemma 3 on an object detection task ☆74 Updated 3 weeks ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆82 Updated last year
- Using the moondream VLM with optical flow for promptable object tracking ☆68 Updated 5 months ago
- ☆180 Updated 5 months ago
- This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device. ☆56 Updated 3 weeks ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆119 Updated 2 months ago
- Python SDK for Llama Stack ☆171 Updated last month
- NVIDIA AI Blueprint for multimodal PDF data extraction for enterprise RAG ☆345 Updated 4 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆234 Updated last year
- A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson ☆206 Updated last year
- Context-Aware RAG library for Knowledge Graph ingestion and retrieval functions. ☆27 Updated last week
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆178 Updated 3 months ago
- META‑AGENTIC α‑AGI 👁️✨ — Mission 🎯 End‑to‑end: Identify 🔍 → Out‑Learn 📚 → Out‑Think 🧠 → Out‑Design 🎨 → Out‑Strategise ♟️ → Out‑Exe…☆251Updated this week
- An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG) ☆335 Updated 2 months ago
- The NVIDIA RTX™ AI Toolkit is a suite of tools and SDKs for Windows developers to customize, optimize, and deploy AI models across RTX PC… ☆168 Updated 8 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆477 Updated 2 weeks ago