☆40Jul 19, 2024Updated last year
Alternatives and similar repositories for visual_task_vectors
Users that are interested in visual_task_vectors are comparing it to the libraries listed below
Sorting:
- Public code repo for EMNLP 2024 Findings paper "MACAROON: Training Vision-Language Models To Be Your Engaged Partners"☆14Sep 28, 2024Updated last year
- [AAAI 2025] Official pytorch implementation of "Diffusion Model Patching via Mixture-of-Prompts"☆13Dec 12, 2024Updated last year
- Code Repository for the NeurIPS 2022 paper: "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights".☆17Jul 10, 2024Updated last year
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information☆12Oct 11, 2024Updated last year
- Separable Diffusion Model Unlearning☆13Jan 29, 2025Updated last year
- ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation☆27May 27, 2025Updated 9 months ago
- Minimal Implimentation of VCRec (2024) for collapse provention.☆18Jan 28, 2025Updated last year
- [IROS 2025] CRUISE: Cooperative Reconstruction and Editing in V2X Scenarios using Gaussian Splatting☆30Jul 25, 2025Updated 7 months ago
- This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.☆15Feb 12, 2024Updated 2 years ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Mar 4, 2024Updated last year
- Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models☆21May 29, 2025Updated 9 months ago
- The official implementation of 'GRID: Visual Layout Generation.'☆21Dec 28, 2024Updated last year
- Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?"☆47Jun 2, 2025Updated 9 months ago
- This repo is the official implementation of iSeg: An Iterative Refinement-based Framework for Training-free Segmentation.☆40Nov 25, 2024Updated last year
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆21Jan 29, 2025Updated last year
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- ☆19Feb 1, 2023Updated 3 years ago
- Video Diffusion State Space Models☆19Mar 27, 2024Updated last year
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022.☆23Nov 23, 2022Updated 3 years ago
- Official implementation for the paper "Understanding Hyperdimensional Computing for Parallel Single-Pass Learning"☆23Jun 10, 2023Updated 2 years ago
- A probabilitic model for contextual word representation. Accepted to ACL2023 Findings.☆25Oct 22, 2023Updated 2 years ago
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆21Jun 2, 2025Updated 9 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆86Oct 26, 2025Updated 4 months ago
- 【NeurIPS 2024】The implementation of LIVE: Learnable In-Context Vector for Visual Question Answering https://arxiv.org/abs/2406.13185☆22May 31, 2025Updated 9 months ago
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation".☆27Oct 13, 2024Updated last year
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆22Sep 26, 2024Updated last year
- ☆25Jun 29, 2025Updated 8 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…☆161Sep 27, 2025Updated 5 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆58Sep 26, 2024Updated last year
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf…☆25Oct 14, 2024Updated last year
- simple and efficient baselines for practical semantic segmentation with plain ViTs☆20Mar 9, 2024Updated last year
- Neuron Activation☆26Nov 21, 2024Updated last year
- ☆28May 29, 2024Updated last year
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"☆28Jul 31, 2024Updated last year
- Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"☆413Mar 25, 2024Updated last year
- Official Implementation of "Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Func…☆29Dec 2, 2024Updated last year
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated last year
- quick playground to animate pippin☆14Nov 11, 2024Updated last year
- Directional Preference Alignment☆58Sep 23, 2024Updated last year