JohannesTheo / trapped-in-texture-bias
Official code release for the paper Trapped in texture bias? A large scale comparison of deep instance segmentation, accepted at ECCV 2022
☆14Updated 8 months ago
Related projects: ⓘ
- ☆32Updated 8 months ago
- AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024)☆29Updated 4 months ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆15Updated 3 weeks ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆30Updated last month
- Official Pytorch Implementation of Self-emerging Token Labeling☆30Updated 5 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆22Updated 3 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation☆45Updated 2 weeks ago
- ☆25Updated last year
- Code for our ICLR 2024 paper "PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts"☆76Updated 4 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆36Updated last month
- ☆30Updated 7 months ago
- ☆17Updated last month
- Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…☆47Updated 2 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Updated 11 months ago
- REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets☆11Updated 11 months ago
- ☆24Updated last year
- Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries".☆43Updated 4 months ago
- [CBMI2024] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".☆17Updated 2 months ago
- ☆25Updated 3 weeks ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆30Updated 6 months ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆84Updated 5 months ago
- MIMIC: Masked Image Modeling with Image Correspondences☆15Updated 3 months ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…☆18Updated 10 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"☆23Updated 3 months ago
- A big_vision inspired repo that implements a generic Auto-Encoder class capable in representation learning and generative modeling.☆28Updated 2 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆15Updated last week
- SMILE: A Multimodal Dataset for Understanding Laughter☆13Updated last year
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024…☆22Updated last month
- ☆36Updated 4 months ago