sieve-community / describe
Incredibly descriptive audiovisual summaries for videos
☆41Updated 11 months ago
Alternatives and similar repositories for describe
Users that are interested in describe are comparing it to the libraries listed below
- A multi-modal AI Model that can generate high quality novel videos with text, images, or video clips.☆64Updated last year
- ☆31Updated last year
- ☆14Updated 7 months ago
- ☆46Updated last year
- ☆13Updated last year
- Gradio app to track objects in video and add visual effects☆17Updated 2 weeks ago
- ☆29Updated last year
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2☆123Updated 8 months ago
- Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D…☆36Updated 5 months ago
- ☆181Updated last month
- Command-line script for inferencing from models such as WizardCoder☆26Updated last year
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆48Updated 5 months ago
- Website source code for our ACM MM'23 paper "Hierarchical Masked 3D Diffusion Model for Video Outpainting".☆41Updated last year
- FLUX.1-dev LoRA Outfit Generator can create an outfit by detailing the color, pattern, fit, style, material, and type.☆67Updated 8 months ago
- ☆25Updated last year
- Community ComfyUI workflows running on fal.ai☆58Updated 10 months ago
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆177Updated 4 months ago
- ☆79Updated last year
- ☆12Updated last year
- [IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort☆151Updated 7 months ago
- LCM LoRA☆37Updated last year
- ☆13Updated last year
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆131Updated 7 months ago
- ☆202Updated last year
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆141Updated 8 months ago
- Fine-tune of Florence-2 for shot categorization.☆26Updated 4 months ago
- Gradio UI for a Cog API☆69Updated last year
- A Gradio web UI for Andrew Ng's translation-agent☆29Updated 7 months ago
- Live2Diff: A pipeline that processes live video streams with a uni-directional video diffusion model.☆186Updated 11 months ago
- ☆55Updated last year