guoweiyu / Logic-in-Frames
Logic-in-Frames: Dynamic keyframe search via visual semantic-logical verification for long video understanding
☆32Updated last week
Alternatives and similar repositories for Logic-in-Frames
Users who are interested in Logic-in-Frames compare it to the repositories listed below.
- ☆223Updated 3 weeks ago
- [NeurIPS2024] MVGamba: Unify 3D Content Generation as State Space Sequence Modeling☆65Updated 11 months ago
- [AAAI 2026 Oral🔥] Official code for Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptati…☆72Updated last year
- [IROS2025] OpenGS-Fusion: Open-Vocabulary Dense Mapping with Hybrid 3D Gaussian Splatting for Refined Object-Level Understanding☆71Updated 3 months ago
- UR2: Unify RAG and Reasoning through Reinforcement Learning☆122Updated last week
- ☆288Updated last month
- [NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding☆327Updated 3 weeks ago
- [EMNLP2025] Official implementation: Agent-style vision question answer in Autonomous Driving!☆131Updated 2 months ago
- [ICLR 2025] Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling☆82Updated 9 months ago
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models☆214Updated 3 weeks ago
- Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better☆181Updated 5 months ago
- [ICML 2025 Poster] Official PyTorch Implementation of "Habitizing Diffusion Planning for Efficient and Effective Decision Making"☆35Updated 6 months ago
- Official implementation for "HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human …☆362Updated 3 weeks ago
- 🦎 Yo'Chameleon: Your Personalized Chameleon (CVPR 2025)☆148Updated 6 months ago
- Official repository of MMGenBench☆120Updated 8 months ago
- [ICCV'25] Official implementation of paper "Towards Performance Consistency in Multi-Level Model Collaboration"☆42Updated last month
- Official code implementation of Context Cascade Compression: Exploring the Upper Limits of Text Compression☆63Updated last week
- ☆114Updated 3 months ago
- ☆55Updated 3 months ago
- Explain Before You Answer: A Survey on Compositional Visual Reasoning☆295Updated last month
- [AAAI 2026 🔥] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation"☆172Updated 3 months ago
- AAAI'2026 Oral, FreeAskWorld is an interactive simulation framework that integrates large language models (LLMs) for high-level planning …☆52Updated last week
- CVPR2025☆44Updated 8 months ago
- Dynamic human image animation with strong identity preservation, heterogeneous character driving, and controllable backgrounds.☆139Updated 6 months ago
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dep…☆573Updated 3 months ago
- Official Implementation of Puzzles: Unbounded Video-Depth Augmentation for Scalable, End-to-End 3D Reconstruction.☆209Updated 2 months ago
- **Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.☆309Updated 3 weeks ago
- Official Repo of "RobustFlow: Towards Robust Agentic Workflow Generation"☆227Updated last month
- ☆545Updated last month
- ☆247Updated 10 months ago