ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
☆121Sep 20, 2025Updated 5 months ago
Alternatives and similar repositories for ComfyMind
Users that are interested in ComfyMind are comparing it to the libraries listed below
- Official implementation of "Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning"☆16Jan 22, 2025Updated last year
- [WACV2025] Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field☆14Nov 3, 2024Updated last year
- UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture☆94Feb 5, 2026Updated 3 weeks ago
- ☆95Feb 4, 2026Updated 3 weeks ago
- Controlnet module for Wan2.2☆42Oct 30, 2025Updated 4 months ago
- ComfyUI custom node implementation of VideoMaMa for video matting with mask conditioning.☆34Feb 9, 2026Updated 3 weeks ago
- Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation (ICCV 2023)☆66Sep 28, 2023Updated 2 years ago
- The official implementation of StereoPilot☆101Dec 19, 2025Updated 2 months ago
- Code of our paper "A Unified Agentic Framework for Evaluating Conditional Image Generation".☆30Jul 22, 2025Updated 7 months ago
- [ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing☆164Jun 26, 2025Updated 8 months ago
- ☆16Sep 4, 2025Updated 5 months ago
- The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows☆122Aug 16, 2025Updated 6 months ago
- VideoAuteur: Towards Long Narrative Video Generation☆43Oct 22, 2025Updated 4 months ago
- An Efficient Text-to-Image Generation Pretrain Pipeline☆130Apr 18, 2025Updated 10 months ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…☆29Jan 18, 2026Updated last month
- ☆11Jun 28, 2024Updated last year
- ☆40Updated this week
- LVAS-Agent Code Base☆22Apr 15, 2025Updated 10 months ago
- VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning☆270Apr 15, 2025Updated 10 months ago
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆62Dec 9, 2025Updated 2 months ago
- ☆71Nov 24, 2025Updated 3 months ago
- ☆95Mar 3, 2025Updated last year
- ☆19Dec 20, 2025Updated 2 months ago
- CVPR 2023: PAniC-3D, rendering☆15Mar 25, 2023Updated 2 years ago
- A ComfyUI node for transforming images into descriptive text using templated visual question answering. Leverages Hugging Face's VQA mode…☆12Apr 1, 2025Updated 11 months ago
- [CVPR 2025] Official implementation of "Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation"☆291May 24, 2025Updated 9 months ago
- A large-scale 3D caricature dataset that contains high-quality diversified 3D caricatures manually crafted by professional artists.☆62Sep 10, 2022Updated 3 years ago
- [CVPR 2026] Official repo of "MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing"☆76Updated this week
- PICABench: How Far Are We from Physically Realistic Image Editing?☆36Nov 5, 2025Updated 3 months ago
- This is a ComfyUI node that integrates pruna☆66Sep 8, 2025Updated 5 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning☆114Dec 24, 2025Updated 2 months ago
- Official code for "FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance", CVPR 2024.☆20Sep 11, 2024Updated last year
- Artifact evaluation of MobiSys25 SynCheck☆19Mar 24, 2025Updated 11 months ago
- Official implementation of "LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Guided Image Editing"☆15May 27, 2025Updated 9 months ago
- An official implementation of EvoSearch: Scaling Image and Video Generation via Test-Time Evolutionary Search☆100Oct 3, 2025Updated 5 months ago
- ☆29Aug 19, 2025Updated 6 months ago
- Prompt Generator for Video, Audio, Image, and Text. A node for ComfyUI. Including Deepseek, Alibaba Cloud Qwen, Google Gemini, and locall…☆53Jul 11, 2025Updated 7 months ago
- Official implementation of “LucidFusion: Reconstructing 3D Gaussians with Arbitrary Unposed Images”☆74Mar 21, 2025Updated 11 months ago
- Official implementation of the paper: [EMNLP 2025] RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruct…☆21Dec 9, 2025Updated 2 months ago