Visual self-questioning for large vision-language assistant.
☆45Jul 23, 2025Updated 8 months ago
Alternatives and similar repositories for SQ-LLaVA
Users that are interested in SQ-LLaVA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Official repository for paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning(https://arxiv.org/abs/2406.17770).☆159Sep 27, 2024Updated last year
- [ECCV2024]FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance☆17Sep 11, 2024Updated last year
- UGround: Towards Unified Visual Grounding with Unrolled Transformers☆22Feb 15, 2026Updated last month
- ☆65Jun 16, 2025Updated 9 months ago
- This is a collection of awesome papers I have read (carefully or roughly) in the fields of computer vision, machine learning, pattern rec…☆26Aug 8, 2024Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- [ICDM 2023] Momentum is All You Need for Data-Driven Adaptive Optimization☆26Mar 30, 2024Updated 2 years ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning☆80Oct 25, 2024Updated last year
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- [ICLR 23] Contrastive Aligned of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- Pytorch implementation for Egoinstructor at CVPR 2024☆28Dec 1, 2024Updated last year
- Explainable Face Recognition ECCV 2020 Paper code and dataset repository☆65Apr 28, 2021Updated 4 years ago
- Official Repository for CVPR 2024 Paper: "Large Language Models are Good Prompt Learners for Low-Shot Image Classification"☆42Jul 1, 2024Updated last year
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Dec 21, 2023Updated 2 years ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆28Aug 9, 2025Updated 8 months ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- ☆15Feb 27, 2025Updated last year
- BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for Multi-View BEV 3D Object Detection☆14Apr 2, 2024Updated 2 years ago
- [NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation☆43Mar 24, 2023Updated 3 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated 2 years ago
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models☆105Nov 22, 2025Updated 4 months ago
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning☆27Mar 23, 2025Updated last year
- HD-EPIC Python script to download the entire datasets or parts of it☆19Oct 7, 2025Updated 6 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.☆28Dec 30, 2025Updated 3 months ago
- Python logging package for easy reproducible experimenting in research☆41Jul 29, 2025Updated 8 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Code for NeurIPS 2022 Datasets and Benchmarks paper - EgoTaskQA: Understanding Human Tasks in Egocentric Videos.☆43Apr 17, 2023Updated 2 years ago
- [EMNLP'2024 Findings] Explore generated documents for enhanced IR with LLMs. We enhance BM25 to surpass strong dense retriever on many da…☆15Mar 28, 2025Updated last year
- Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025).☆28Jan 26, 2025Updated last year
- [ICLR'21] Neural Pruning via Growing Regularization (PyTorch)☆81Jul 15, 2021Updated 4 years ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆65Jul 16, 2024Updated last year
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models☆59Dec 20, 2024Updated last year
- This repository is the official data collection of MMFundus (Multimodal Fundus) dataset.☆13Feb 2, 2026Updated 2 months ago
- [EMNLP 2024 Main] Code for the paper "Dissecting Fine-Tuning Unlearning in Large Language Models"☆15Oct 10, 2024Updated last year
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding☆13May 9, 2025Updated 11 months ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts☆338Jul 17, 2024Updated last year
- ☆120Jan 27, 2025Updated last year
- ☆54Jan 17, 2025Updated last year
- [AAAI 2024] VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning☆14Apr 7, 2024Updated 2 years ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024)☆123Mar 26, 2025Updated last year
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV …☆25Dec 4, 2025Updated 4 months ago
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments☆13Jul 8, 2024Updated last year