mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses two AI models, Florence2 and SAM2, to detect and segment specific objects or activities in a video based on textual descriptions.
☆10 · Updated 5 months ago
Alternatives and similar repositories for text2segment_video
Users who are interested in text2segment_video are comparing it to the repositories listed below.
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models ☆15 · Updated last year
- ☆46 · Updated last year
- Incredibly descriptive audiovisual summaries for videos ☆41 · Updated last year
- Passively collect images for computer vision datasets on the edge. ☆35 · Updated last year
- Real-Time Open-Vocabulary Object Detection ☆13 · Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆65 · Updated 11 months ago
- ☆29 · Updated last year
- ☆24 · Updated last year
- Gradio app to track objects in video and add visual effects ☆17 · Updated 2 weeks ago
- wav2lip-api ☆11 · Updated 2 years ago
- Playground Web UI using segment-anything-2 models from Meta. ☆54 · Updated 8 months ago
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 · Updated last year
- 💡💡💡 awesome computer vision app in gradio ☆54 · Updated last year
- Simple CogVLM client script ☆14 · Updated last year
- VideoDB Python SDK ☆78 · Updated last week
- ☆31 · Updated 2 years ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆12 · Updated last year
- Community ComfyUI workflows running on fal.ai ☆58 · Updated 11 months ago
- ☆79 · Updated last year
- ☆13 · Updated last year
- Image Prompter for Gradio ☆92 · Updated last year
- ☆31 · Updated last year
- ImageSlider custom component for gradio. ☆42 · Updated last year
- Object segmentation in collaboration with Segment Anything Model and Yolov8 ☆25 · Updated 2 years ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆14 · Updated 3 weeks ago
- ☆14 · Updated 8 months ago
- Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models. ☆126 · Updated last year
- ☆33 · Updated last week
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆13 · Updated last year
- A multi-modal AI Model that can generate high quality novel videos with text, images, or video clips. ☆64 · Updated last year