Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples
☆40 · Nov 27, 2024 · Updated last year
Alternatives and similar repositories for IPLoc
Users who are interested in IPLoc are comparing it to the libraries listed below.
- ☆11 · Dec 20, 2024 · Updated last year
- ☆12 · Apr 18, 2025 · Updated last year
- [MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology ☆12 · Jun 17, 2025 · Updated last year
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" ☆26 · Jun 8, 2025 · Updated last year
- ☆11 · Oct 29, 2024 · Updated last year
- Code for our paper: "Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval". ☆14 · Feb 26, 2025 · Updated last year
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ☆12 · Oct 11, 2024 · Updated last year
- ☆10 · Nov 12, 2024 · Updated last year
- Code of paper "A Video Dataset for Falling Object Detection around Buildings" https://arxiv.org/abs/2408.05750 ☆20 · Jul 10, 2025 · Updated 11 months ago
- BESA is a differentiable weight pruning technique for large language models. ☆17 · Mar 4, 2024 · Updated 2 years ago
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information ☆11 · Oct 11, 2024 · Updated last year
- ☆25 · Mar 25, 2025 · Updated last year
- Validating image classification benchmark results on ViTs and ResNets (v2) ☆13 · Nov 3, 2022 · Updated 3 years ago
- 3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers ☆14 · Apr 17, 2023 · Updated 3 years ago
- super-resolution; post-training quantization; model compression ☆14 · Nov 10, 2023 · Updated 2 years ago
- Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024) ☆188 · Jul 5, 2024 · Updated last year
- Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown ☆41 · Feb 22, 2026 · Updated 4 months ago
- [EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models ☆148 · Aug 21, 2025 · Updated 10 months ago
- (ICCV 2023) Generative Multiplane Neural Radiance for 3D Aware Image Generation. ☆18 · Sep 28, 2023 · Updated 2 years ago
- ☆21 · Oct 10, 2023 · Updated 2 years ago
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions ☆55 · Apr 17, 2024 · Updated 2 years ago
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space' ☆18 · Jul 21, 2024 · Updated last year
- Public code repo for EMNLP 2024 Findings paper "MACAROON: Training Vision-Language Models To Be Your Engaged Partners" ☆14 · Sep 28, 2024 · Updated last year
- [ICML 2025] QT-DOG: QUANTIZATION-AWARE TRAINING FOR DOMAIN GENERALIZATION ☆25 · Nov 30, 2025 · Updated 7 months ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ☆50 · Aug 23, 2024 · Updated last year
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models" ☆20 · Mar 31, 2025 · Updated last year
- [⭐ CVPR 2025 Highlight ⭐] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro… ☆31 · Apr 22, 2025 · Updated last year
- [NAACL'25] Contains code and documentation for our VANE-Bench paper. ☆24 · Aug 19, 2025 · Updated 10 months ago
- ☆36 · Feb 5, 2024 · Updated 2 years ago
- Official Implementation of the paper "DifFSS: Diffusion Model for Few-Shot Semantic Segmentation" ☆14 · Jul 26, 2023 · Updated 2 years ago
- Official PyTorch Implementation of MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation (CVPR … ☆30 · Mar 15, 2024 · Updated 2 years ago
- [BMVC 2025] Official Implementation of the paper "PerSense: Personalized Instance Segmentation in Dense Images" ☆31 · Dec 18, 2025 · Updated 6 months ago
- Unified-modal Salient Object Detection via Adaptive Prompt Learning ☆12 · Oct 17, 2025 · Updated 8 months ago
- This project explores Robotic Path Planning Using Diffusion Models (Janner et al., 2023) in 2D and 3D environments. The project was compl… ☆14 · Feb 11, 2025 · Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆68 · May 31, 2024 · Updated 2 years ago
- A dataset of scientific vector graphics in TikZ for training generative models. ☆27 · Feb 4, 2026 · Updated 5 months ago
- General Navigation Models based on GNM, ViNT, NoMaD as a PyTorch repo for quick and easy deployment ☆15 · Nov 18, 2024 · Updated last year
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024) ☆123 · Mar 26, 2025 · Updated last year
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆91 · Nov 20, 2025 · Updated 7 months ago