kyegomez / BRAVE-ViT-Swarm
Implementation of the paper: "BRAVE: Broadening the visual encoding of vision-language models"
☆26 Updated this week
Alternatives and similar repositories for BRAVE-ViT-Swarm:
Users who are interested in BRAVE-ViT-Swarm are comparing it to the libraries listed below:
- ☆37 Updated 8 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆75 Updated 7 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆123 Updated 7 months ago
- ☆45 Updated 3 months ago
- [CVPR 2024 Highlight] OpenBias: Open-set Bias Detection in Text-to-Image Generative Models ☆23 Updated 2 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆98 Updated 11 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 Updated 4 months ago
- ☆64 Updated 2 months ago
- 🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU) ☆32 Updated 2 months ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs. ☆98 Updated 3 weeks ago
- Official implementation of the paper The Hidden Language of Diffusion Models ☆72 Updated last year
- Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des… ☆55 Updated 9 months ago
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation ☆42 Updated 6 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆71 Updated 3 weeks ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆118 Updated last year
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆20 Updated last week
- ☆32 Updated last year
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models" ☆16 Updated 6 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs ☆26 Updated 3 months ago
- Diffusion Models as Data Mining Tools ☆53 Updated last month
- An open source implementation of CLIP (With TULIP Support) ☆125 Updated 3 weeks ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 Updated 7 months ago
- [ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts" ☆76 Updated 11 months ago
- Official repository of paper "Subobject-level Image Tokenization" ☆69 Updated 2 weeks ago
- ☆37 Updated 7 months ago
- ☆43 Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". ☆102 Updated 10 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆23 Updated 6 months ago
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators" ☆104 Updated last year
- This is a public repository for Image Clustering Conditioned on Text Criteria (IC|TC) ☆87 Updated last year