Ravi-Teja-konda / Surveillance_Video_Summarizer
A VLM-driven tool that processes surveillance videos, extracts frames, and generates annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for querying and analyzing video footage.
☆110 Updated 8 months ago
Alternatives and similar repositories for Surveillance_Video_Summarizer
Users who are interested in Surveillance_Video_Summarizer are comparing it to the repositories listed below.
- ☆72 Updated last week
- ☆204 Updated 11 months ago
- Embed anything. ☆29 Updated 11 months ago
- 🐮📢 The first AI voice assistant that interrupts *you* ☆144 Updated 8 months ago
- Use the Moondream 2 model to detect faces and their gaze directions in videos. ☆39 Updated 4 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 Updated 11 months ago
- Gradio-based tool to run open-source LLMs directly from Hugging Face ☆91 Updated 10 months ago
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆103 Updated 5 months ago
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆33 Updated 4 months ago
- AutoNL - Natural Language Automation tool ☆85 Updated last year
- Using the moondream VLM with optical flow for promptable object tracking ☆54 Updated 2 months ago
- This project uses the LlamaIndex multi-agent concierge system and the Qdrant vector database to customize the RAG application with use… ☆50 Updated 8 months ago
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆63 Updated 9 months ago
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 Updated 8 months ago
- Agentic RAG to help you build a startup🚀 ☆41 Updated last month
- Structured information extraction from documents ☆314 Updated 7 months ago
- ☆39 Updated last year
- ☆112 Updated 5 months ago
- Jockey is a conversational video agent. ☆76 Updated 3 months ago
- ☆130 Updated 2 weeks ago
- This repository stores the source code for the Mistral Hackathon 2024 in Paris ☆16 Updated 8 months ago
- ☆28 Updated last year
- An extension of the previous 'Fitness-AI-Coach': a complete web application with real-time exercise recognition and counting. The exercis… ☆80 Updated 3 months ago
- RAG chatbot boilerplate based on React and TypeScript ☆33 Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆93 Updated 4 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆86 Updated last year
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆70 Updated this week
- Repo of the code from the Medium article ☆20 Updated 11 months ago
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ☆25 Updated 2 months ago
- ☆68 Updated 7 months ago