mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses the Florence2 and SAM2 models to detect and segment specific objects or activities in a video based on textual descriptions.
☆10Updated 7 months ago
Alternatives and similar repositories for text2segment_video
Users that are interested in text2segment_video are comparing it to the libraries listed below
- ☆47Updated last year
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆14Updated last year
- Passively collect images for computer vision datasets on the edge.☆35Updated last year
- ☆29Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models.☆67Updated last year
- Playground Web UI using segment-anything-2 models from Meta.☆53Updated 10 months ago
- Incredibly descriptive audiovisual summaries for videos☆41Updated last year
- Real-Time Open-Vocabulary Object Detection☆12Updated last year
- Gradio app to track objects in video and add visual effects☆17Updated 2 months ago
- wav2lip-api☆11Updated 2 years ago
- FLUX.1-dev LoRA Outfit Generator can create an outfit by detailing the color, pattern, fit, style, material, and type.☆69Updated 11 months ago
- Official repo for the paper “StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback”☆19Updated last month
- ImageSlider custom component for gradio.☆42Updated last year
- Flask-based web application designed to compare text and image embeddings using the CLIP model.☆22Updated last year
- Diffusers Image Fill v3 -- Inpaint or Remove objects from an image - or Outpaint - or Outpaint Video Zoom: 16GB+ GPU | 32GB+ RAM | 20GB+…☆14Updated 11 months ago
- Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)☆17Updated last year
- Get up and running with Llama 3, Mistral, Gemma, and other large language models.☆30Updated 2 weeks ago
- Simple CogVLM client script☆14Updated last year
- ☆68Updated 5 months ago
- ☆24Updated last year
- ☆40Updated last year
- ☆16Updated last year
- Video Diffusion WebUI: Text2Video + Image2Video + Video2Video WebUI☆65Updated last year
- ☆15Updated 2 months ago
- ☆13Updated last year
- optimized wav2lip☆18Updated last year
- Cog wrapper for FalconsAi / nsfw_image_detection☆16Updated 2 months ago
- Summarize YouTube videos and generate timestamps efficiently using LLMs [Google Gemini Pro, OpenAI ChatGPT]☆82Updated last week
- ☆79Updated last year
- Community ComfyUI workflows running on fal.ai☆58Updated last year