showlab / sparseformer
(ICLR 2024, CVPR 2024) SparseFormer
☆67 Updated 2 months ago
Alternatives and similar repositories for sparseformer:
Users who are interested in sparseformer are comparing it to the libraries listed below.
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆31 Updated 7 months ago
- ☆57 Updated last year
- ☆58 Updated last year
- ☆52 Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆32 Updated last month
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆91 Updated 6 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆35 Updated 7 months ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆24 Updated 11 months ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation ☆35 Updated last year
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆118 Updated 4 months ago
- Official Pytorch codebase for Open-Vocabulary Instance Segmentation without Manual Mask Annotations [CVPR 2023] ☆49 Updated last week
- Official repository of paper: "FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation" ☆24 Updated last year
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆61 Updated 2 months ago
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958) ☆44 Updated last year
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆60 Updated 5 months ago
- Official code for "Opening up Open World Tracking" (CVPR 2022) ☆55 Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 Updated 2 years ago
- [ECCV 2024] Official implementation of the paper "Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning… ☆23 Updated 5 months ago
- The official implementation of JM3D. ☆28 Updated last year
- ☆29 Updated 9 months ago
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation ☆46 Updated 6 months ago
- IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model ☆26 Updated last month
- ☆25 Updated 10 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023) ☆31 Updated last year
- Code release for "SegLLM: Multi-round Reasoning Segmentation" ☆56 Updated last week
- ☆103 Updated 7 months ago
- ☆37 Updated 2 years ago
- [AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation ☆73 Updated 6 months ago
- (ECCV 2024) Can OOD Object Detectors Learn from Foundation Models? ☆23 Updated last month