mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2)
This project provides a video processing tool that uses the Florence2 and SAM2 models to detect and segment specific objects or activities in a video based on textual descriptions.
☆10Updated 11 months ago
Alternatives and similar repositories for text2segment_video
Users who are interested in text2segment_video are comparing it to the repositories listed below.
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆15Updated 2 years ago
- Incredibly descriptive audiovisual summaries for videos☆41Updated last year
- ☆47Updated last year
- Real-Time Open-Vocabulary Object Detection☆12Updated 2 years ago
- ☆72Updated 2 months ago
- ☆12Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆12Updated 2 years ago
- ImageSlider custom component for gradio.☆43Updated last year
- Gradio app to track objects in video and add visual effects☆17Updated 6 months ago
- ☆29Updated 2 years ago
- ☆12Updated 2 years ago
- ☆16Updated 6 months ago
- Playground Web UI using segment-anything-2 models from Meta.☆56Updated last year
- ☆17Updated 2 years ago
- ☆15Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models.☆69Updated last year
- ☆148Updated last month
- ☆15Updated last year
- Diffusers Image Fill v3 -- Inpaint or Remove objects from an image - or Outpaint - or Outpaint Video Zoom: 16GB+ GPU | 32GB+ RAM | 20GB+…☆16Updated last year
- ☆24Updated last year
- ☆25Updated 2 years ago
- An AI try-on application for generating photos with AI character wearing the same clothes as the one in the input photo.☆14Updated 2 years ago
- ☆15Updated 2 years ago
- Cog wrapper for FalconsAi / nsfw_image_detection☆18Updated 6 months ago
- FLUX.1-dev LoRA Outfit Generator can create an outfit by detailing the color, pattern, fit, style, material, and type.☆69Updated last year
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆51Updated 11 months ago
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them.☆49Updated last year
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆18Updated 3 months ago
- ☆40Updated 2 years ago
- Orchestrating AI for stunning lip-synced videos. Effortless workflow, exceptional results, all in one place.☆75Updated 7 months ago