mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2)
This project provides a video processing tool that uses two AI models, Florence2 and SAM2, to detect and segment specific objects or activities in a video based on textual descriptions.
☆10Updated 4 months ago
Alternatives and similar repositories for text2segment_video
Users who are interested in text2segment_video are comparing it to the libraries listed below.
- ☆46Updated last year
- ☆29Updated last year
- Incredibly descriptive audiovisual summaries for videos☆41Updated 11 months ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆15Updated last year
- Passively collect images for computer vision datasets on the edge.☆34Updated last year
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them.☆47Updated last year
- ☆31Updated last year
- Gradio app to track objects in video and add visual effects☆17Updated 2 weeks ago
- Chinese Stable Diffusion, zh SD, Chinese text-to-image, Chinese SD☆49Updated last year
- Playground Web UI using segment-anything-2 models from Meta.☆54Updated 7 months ago
- [WIP] AI Try-On plugin for Chrome☆27Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models.☆64Updated 11 months ago
- Our idea is to combine the power of computer vision models and LLMs. We use YOLO, CLIP and DINOv2 to extract high-level features from imag…☆116Updated 2 years ago
- Real-Time Open-Vocabulary Object Detection☆13Updated last year
- wav2lip-api☆11Updated 2 years ago
- ☆16Updated last year
- FLUX.1-dev LoRA Outfit Generator can create an outfit by detailing the color, pattern, fit, style, material, and type.☆67Updated 8 months ago
- Cog wrapper for moondream2☆13Updated 11 months ago
- 💡💡💡awesome computer vision app in gradio☆53Updated last year
- Orchestrating AI for stunning lip-synced videos. Effortless workflow, exceptional results, all in one place.☆73Updated 3 weeks ago
- Image Prompter for Gradio☆92Updated last year
- ☆31Updated last year
- ☆13Updated 7 months ago
- MetaCLIP module for use with Autodistill.☆21Updated last year
- ☆40Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆13Updated last year
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…☆36Updated last year
- ☆55Updated last year
- Diffusers Image Fill v3 -- Inpaint or Remove objects from an image - or Outpaint - or Outpaint Video Zoom: 16GB+ GPU | 32GB+ RAM | 20GB+…☆13Updated 8 months ago
- ☆24Updated last year