WesleyHsieh0806 / Amodal-Expander
Official training and inference code for Amodal Expander, proposed in "Tracking Any Object Amodally"
☆17Updated 10 months ago
Alternatives and similar repositories for Amodal-Expander:
Users who are interested in Amodal-Expander are comparing it to the repositories listed below.
- Official PyTorch Implementation of Self-emerging Token Labeling☆33Updated last year
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation"☆37Updated last year
- Code for the paper "Placing Objects in Context via Inpainting for Out-of-distribution Segmentation", ECCV 2024☆21Updated 8 months ago
- MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"☆17Updated 9 months ago
- This repository is for the first survey on SAM & SAM2 for Videos.☆47Updated last week
- EfficientViT is a new family of vision models for efficient high-resolution vision.☆24Updated last year
- Code release for the CVPR'23 paper "PartDistillation: Learning Parts from Instance Segmentation"☆58Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆26Updated last year
- The official repository for the RealSyn dataset☆32Updated last week
- Official PyTorch implementation for "IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION" [ICLR 2025]☆42Updated last month
- Official code repository for paper: "ExPLoRA: Parameter-Efficient Extended Pre-training to Adapt Vision Transformers under Domain Shifts"☆31Updated 7 months ago
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding"☆32Updated 11 months ago
- Official PyTorch implementation for "Distilling Image Classifiers in Object Detection" (NeurIPS 2021)☆31Updated 3 years ago
- Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024]☆14Updated 10 months ago
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation☆11Updated last year
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 9 months ago
- [ICME 2022] Code for the paper "SimViT: Exploring a Simple Vision Transformer with Sliding Windows".☆68Updated 2 years ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"☆33Updated 3 years ago
- Odd-One-Out: Anomaly Detection by Comparing with Neighbors (CVPR25)☆37Updated 5 months ago
- SAM-CLIP module for use with Autodistill.☆15Updated last year
- Training with Product Digital Twins for AutoRetail Checkout☆18Updated last year
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆27Updated last year
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023☆14Updated 5 months ago