mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses two AI models, Florence2 and SAM2, to detect and segment specific objects or activities in a video based on textual descriptions.
☆9 Updated last month
Alternatives and similar repositories for text2segment_video
Users interested in text2segment_video are comparing it to the repositories listed below:
- ☆29 Updated last year
- ☆46 Updated last year
- Passively collect images for computer vision datasets on the edge. ☆31 Updated last year
- Gradio app to track objects in video and add visual effects ☆16 Updated 6 months ago
- wav2lip-api ☆11 Updated 2 years ago
- Simple CogVLM client script ☆14 Updated last year
- MetaCLIP module for use with Autodistill. ☆21 Updated last year
- SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution ☆14 Updated last year
- Python scripts performing optical flow estimation using the NeuFlowV2 model in ONNX. ☆41 Updated 6 months ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models ☆15 Updated last year
- ☆12 Updated 2 months ago
- ☆13 Updated 3 months ago
- Diffusers Image Fill v3 -- Inpaint or Remove objects from an image - or Outpaint - or Outpaint Video Zoom: 16GB+ GPU | 32GB+ RAM | 20GB+… ☆12 Updated 4 months ago
- ☆12 Updated 5 months ago
- ☆30 Updated last year
- Playground Web UI using segment-anything-2 models from the Meta. ☆46 Updated 3 months ago
- Cog wrapper for FalconsAi / nsfw_image_detection ☆16 Updated last year
- [WIP] AI Try-On plugin for Chrome ☆27 Updated last year
- Orchestrating AI for stunning lip-synced videos. Effortless workflow, exceptional results, all in one place. ☆68 Updated 9 months ago
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆62 Updated 7 months ago
- A minimalistic, hackable code base to finetune Wan video generation model ☆37 Updated last week
- ComfyUI YOLO-World Integration ☆41 Updated 8 months ago
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh… ☆11 Updated last year
- EdgeSAM model for use with Autodistill. ☆26 Updated 9 months ago
- ☆30 Updated last year
- ☆12 Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated last year
- ImageSlider custom component for gradio. ☆40 Updated 10 months ago
- Using open-source LLM Llama2 by Meta on local CPU inference for document question-and-answer ☆15 Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆67 Updated 10 months ago