WesleyHsieh0806 / Amodal-Expander
Official Training and Inference Code of Amodal Expander, Proposed in Tracking Any Object Amodally
☆14 Updated 7 months ago
Alternatives and similar repositories for Amodal-Expander:
Users who are interested in Amodal-Expander compare it to the libraries listed below.
- Official PyTorch Implementation of Self-emerging Token Labeling ☆32 Updated 10 months ago
- SAM-CLIP module for use with Autodistill. ☆13 Updated last year
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi… ☆83 Updated last year
- ☆34 Updated last year
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation" ☆59 Updated last year
- LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆17 Updated last month
- arXiv 23 "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" ☆14 Updated 2 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆35 Updated last year
- EfficientViT is a new family of vision models for efficient high-resolution vision. ☆24 Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆24 Updated last year
- This repository contains the source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 Updated last year
- A Contrastive Learning Boost from Intermediate Pre-Trained Representations ☆41 Updated 5 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆97 Updated 9 months ago
- This repository is for the first survey on SAM for videos. ☆32 Updated 3 weeks ago
- [FGVC9-CVPR 2022] The second place solution for 2nd eBay eProduct Visual Search Challenge. ☆26 Updated 2 years ago
- ☆19 Updated last year
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?" ☆33 Updated 3 years ago
- A simple wrapper library for binding timm models as detectron2 backbones ☆39 Updated last year
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated 10 months ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆24 Updated 4 months ago
- ☆30 Updated 2 months ago
- ☆41 Updated last month
- ☆52 Updated last year
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding" ☆32 Updated 9 months ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆34 Updated 7 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance ☆81 Updated 7 months ago
- [CVPR 2023 Highlight] Beyond mAP: Towards better evaluation of instance segmentation ☆26 Updated last year
- EdgeSAM model for use with Autodistill. ☆26 Updated 8 months ago