Ravi-Teja-konda / Surveillance_Video_Summarizer
A VLM-driven tool that processes surveillance videos, extracts frames, and generates annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for querying and analyzing video footage.
☆125 · Updated 4 months ago
Alternatives and similar repositories for Surveillance_Video_Summarizer
Users who are interested in Surveillance_Video_Summarizer are comparing it to the libraries listed below:
- The first AI voice assistant that interrupts *you* ☆148 · Updated last year
- ☆105 · Updated this week
- Using the moondream VLM with optical flow for promptable object tracking ☆72 · Updated 8 months ago
- Daily Research Bot helps you stay on top of new AI-related research and updates. Currently supports: `huggingface.co/papers` and `hype.re… ☆46 · Updated 11 months ago
- Embed anything. ☆27 · Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆84 · Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆102 · Updated 10 months ago
- Gradio based tool to run opensource LLM models directly from Huggingface ☆96 · Updated last year
- An extension of the previous 'Fitness-AI-Coach': a complete web application with real-time exercise recognition and counting. The exercis… ☆112 · Updated 3 months ago
- Tiny client for LLMs with vision and tool calling. As simple as it gets. ☆88 · Updated 10 months ago
- ☆207 · Updated last year
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆162 · Updated 2 months ago
- Jockey is a conversational video agent. ☆89 · Updated 5 months ago
- Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a sing… ☆179 · Updated 7 months ago
- ☆121 · Updated last month
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆105 · Updated 10 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆87 · Updated 2 years ago
- ☆30 · Updated 10 months ago
- [NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Underst… ☆23 · Updated 7 months ago
- ☆102 · Updated last year
- ☆84 · Updated last year
- Structured information extraction from documents ☆317 · Updated last year
- No longer maintained: Your personal ArXiv Curator ☆41 · Updated 11 months ago
- ☆133 · Updated 6 months ago
- Use the Moondream 2 model to detect faces and their gaze directions in videos. ☆46 · Updated 9 months ago
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 · Updated last year
- run paligemma in real time ☆133 · Updated last year
- Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization and fairness analysis. ☆23 · Updated 2 weeks ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 · Updated last year
- Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments ☆45 · Updated 8 months ago