A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency of long-video VLMs. (ICCV 2025)
☆27 · Aug 7, 2025 · Updated 7 months ago
Alternatives and similar repositories for LSDBench
Users who are interested in LSDBench are comparing it to the repositories listed below
- ImageNet3D: Towards General-Purpose Object-Level 3D Understanding ☆20 · Dec 6, 2024 · Updated last year
- Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting ☆50 · Mar 11, 2025 · Updated 11 months ago
- [Arxiv 2025] ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions ☆45 · Jun 11, 2025 · Updated 8 months ago
- ☆20 · Oct 15, 2025 · Updated 4 months ago
- ☆14 · Sep 11, 2025 · Updated 5 months ago
- Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection ☆22 · Feb 5, 2026 · Updated last month
- [CVPR 2026] Thinking in 360°: Humanoid Visual Search in the Wild ☆120 · Updated this week
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat… ☆21 · Jun 24, 2025 · Updated 8 months ago
- Unlocking Iterative Reasoning for Any Image Editor ☆89 · Jan 18, 2026 · Updated last month
- [ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring … ☆18 · Jun 27, 2025 · Updated 8 months ago
- [NeurIPS 2025]《SD-VLM: Spatial Measuring and Understanding with Depth-encoded Vision Language Models》 ☆37 · Dec 29, 2025 · Updated 2 months ago
- ☆27 · Feb 27, 2025 · Updated last year
- Official implementation for the paper "Towards Understanding How Knowledge Evolves in Large Vision-Language Models" ☆28 · Apr 10, 2025 · Updated 10 months ago
- [CVPR2025] Official code repository for SeTa: "Scale Efficient Training for Large Datasets" ☆23 · Mar 18, 2025 · Updated 11 months ago
- [CVPR 2025] Test-Time Visual In-Context Tuning ☆29 · Dec 31, 2025 · Updated 2 months ago
- [NeurIPS 2025] Official code for ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation ☆33 · Oct 17, 2025 · Updated 4 months ago
- [NIPS24] Official Implementation of Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation ☆20 · Oct 31, 2024 · Updated last year
- Training recipe for SpatialReasoner ☆38 · Sep 21, 2025 · Updated 5 months ago
- Utilize the capability of GPT-4o Vision on the UHHGPT web portal ☆12 · Aug 26, 2024 · Updated last year
- [NeurIPS 2024] Official Implementation of GrounDiT ☆59 · Dec 12, 2024 · Updated last year
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness ☆67 · Jul 22, 2025 · Updated 7 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆131 · May 16, 2025 · Updated 9 months ago
- Official implementation of the paper "Watermarking Autoregressive Image Generation" (NeurIPS'25) ☆58 · Sep 19, 2025 · Updated 5 months ago
- Official PyTorch implementation of the paper "Equivariant Image Modeling" (https://arxiv.org/abs/2503.18948) ☆35 · Aug 1, 2025 · Updated 7 months ago
- [CVPR2025] Official repository for "VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide" ☆28 · May 27, 2025 · Updated 9 months ago
- [CVPR2025] BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding ☆38 · Feb 5, 2026 · Updated last month
- Implementation of Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins. [RSS 2025] ☆49 · Oct 21, 2025 · Updated 4 months ago
- Official Repository of paper: "MotionEdit: Benchmarking and Learning Motion-Centric Image Editing" ☆60 · Updated this week
- The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors' ☆204 · Nov 28, 2025 · Updated 3 months ago
- [NIPS 25'] Evaluation code of paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models" ☆40 · Oct 19, 2025 · Updated 4 months ago
- Scaling Spatial Intelligence with Multimodal Foundation Models ☆177 · Feb 6, 2026 · Updated last month
- Official code for ICCV2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis ☆34 · Dec 27, 2023 · Updated 2 years ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆104 · Jul 9, 2025 · Updated 7 months ago
- implementation of AnimateDiff. ☆32 · Jul 14, 2023 · Updated 2 years ago
- [CVPR 2025] A Unified Image-Dense Annotation Generation Model for Underwater Scenes ☆54 · Apr 9, 2025 · Updated 10 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆41 · Feb 12, 2025 · Updated last year
- Automate dating apps with AI ☆19 · Jan 18, 2024 · Updated 2 years ago
- Code translator from one language to another using AI ☆10 · Feb 24, 2026 · Updated last week
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech ☆28 · Sep 14, 2025 · Updated 5 months ago