mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses the Florence2 and SAM2 models to detect and segment specific objects or activities in a video based on textual descriptions.
☆10Updated 3 months ago
Alternatives and similar repositories for text2segment_video
Users who are interested in text2segment_video are comparing it to the libraries listed below.
- ☆29Updated last year
- Gradio app to track objects in video and add visual effects☆16Updated 2 weeks ago
- Passively collect images for computer vision datasets on the edge.☆33Updated last year
- ☆46Updated last year
- ☆13Updated last year
- Face_lib separate from AI_Power☆25Updated last month
- [WIP] AI Try-On plugin for Chrome☆27Updated last year
- Diffusers Image Fill v3 -- Inpaint or Remove objects from an image - or Outpaint - or Outpaint Video Zoom: 16GB+ GPU | 32GB+ RAM | 20GB+…☆12Updated 6 months ago
- ☆32Updated last year
- ☆24Updated last year
- ☆12Updated 7 months ago
- ☆12Updated last year
- Project Page for VividTalk☆15Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models.☆64Updated 9 months ago
- ☆30Updated last year
- ☆15Updated 5 months ago
- Audio-Visual Lip Synthesis via Intermediate Landmark Representation☆17Updated 2 years ago
- ☆8Updated last year
- FLUX.1-dev LoRA Outfit Generator can create an outfit by detailing the color, pattern, fit, style, material, and type.☆64Updated 7 months ago
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh…☆11Updated last year
- A simple C++ library to detect scene transitions in a video☆14Updated 5 years ago
- Incredibly descriptive audiovisual summaries for videos☆41Updated 10 months ago
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆13Updated last year
- ☆30Updated 2 years ago
- GUI to sync video mouth movements to match audio, utilizing wav2lip-hq. Completed as part of a technical interview.☆11Updated last year
- ☆16Updated last year
- ☆24Updated last year
- MetaCLIP module for use with Autodistill.☆21Updated last year
- wav2lip-api☆11Updated 2 years ago