baopj / Vid-Morp
☆11 · Dec 6, 2024 · Updated last year
Alternatives and similar repositories for Vid-Morp
Users who are interested in Vid-Morp are comparing it to the repositories listed below.
- [ECCV 2024] The first zero-shot setting for spatio-temporal video grounding. ☆11 · Jul 16, 2024 · Updated last year
- ☆17 · Dec 25, 2023 · Updated 2 years ago
- [ICLR 2025] Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding ☆40 · Mar 18, 2025 · Updated 10 months ago
- VLG-Net: Video-Language Graph Matching Networks for Video Grounding ☆31 · May 31, 2022 · Updated 3 years ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024) ☆32 · Mar 29, 2024 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. ☆18 · Jul 10, 2025 · Updated 7 months ago
- ☆46 · Sep 13, 2024 · Updated last year
- [ACL 2025] Official code for "Learning to Reason from Feedback at Test-Time". ☆13 · May 16, 2025 · Updated 8 months ago
- ☆10 · May 18, 2024 · Updated last year
- ☆12 · Jul 4, 2024 · Updated last year
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval ☆22 · Jun 23, 2025 · Updated 7 months ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆54 · May 25, 2025 · Updated 8 months ago
- Official imagej plugin of the "SACD" -v1.1.3 ☆12 · Feb 8, 2024 · Updated 2 years ago
- Simultaneous localization and mapping (SLAM) tools in 3D ☆12 · Sep 3, 2024 · Updated last year
- ☆14 · Dec 2, 2025 · Updated 2 months ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr… ☆34 · Jul 3, 2025 · Updated 7 months ago
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration ☆15 · Nov 18, 2025 · Updated 2 months ago
- Weakly Supervised Referring Video Object Segmentation with Object-Centric Pseudo-Guidance ☆10 · Aug 17, 2024 · Updated last year
- Code for "Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders" at ICML 2024 ☆10 · Sep 18, 2025 · Updated 4 months ago
- ☆14 · Dec 25, 2024 · Updated last year
- [ECCV 2024] Official PyTorch implementation of "Classification Matters: Improving Video Action Detection with Class-Specific Attention" ☆16 · Nov 8, 2024 · Updated last year
- ☆15 · Oct 10, 2023 · Updated 2 years ago
- LLaVA-Next for STVG ☆18 · Dec 5, 2025 · Updated 2 months ago
- The ROS package for QPEP-based ICP Mapping (Originated from https://github.com/ethz-asl/ethzasl_icp_mapping) ☆14 · Oct 4, 2021 · Updated 4 years ago
- 3QFP: Efficient neural implicit surface reconstruction using Tri-Quadtrees and Fourier feature Positional encoding ☆12 · Jan 20, 2026 · Updated 3 weeks ago
- ☆12 · Nov 25, 2023 · Updated 2 years ago
- Dummy project to test your Open3D build ☆10 · May 6, 2021 · Updated 4 years ago
- Project for Deep Learning Methods for Calibrated Photometric Stereo and Beyond ☆11 · Sep 23, 2024 · Updated last year
- This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos" ☆13 · Aug 22, 2025 · Updated 5 months ago
- ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial Attack ☆18 · Jan 25, 2026 · Updated 2 weeks ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision ☆12 · Sep 17, 2023 · Updated 2 years ago
- Certifiable solvers for the relative pose problem (RPp) with known gravity vector ☆13 · Feb 16, 2023 · Updated 2 years ago
- Source code of paper: "HRegNet: A Hierarchical Network for Efficient and Accurate Outdoor LiDAR Point Cloud Registration". ☆12 · Jan 12, 2022 · Updated 4 years ago
- ☆12 · May 22, 2023 · Updated 2 years ago
- Collection of papers about video-audio understanding ☆22 · Dec 26, 2025 · Updated last month
- Learning Iterative Robust Transformation Synchronization ☆15 · Nov 29, 2021 · Updated 4 years ago
- [ICCV 2025] AdsQA: Towards Advertisement Video Understanding Arxiv: https://arxiv.org/abs/2509.08621 ☆32 · Oct 30, 2025 · Updated 3 months ago
- ☆16 · May 15, 2024 · Updated last year
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability ☆16 · May 8, 2025 · Updated 9 months ago