Visual self-questioning for large vision-language assistant.
☆44Jul 23, 2025Updated 11 months ago
Alternatives and similar repositories for SQ-LLaVA
Users that are interested in SQ-LLaVA are comparing it to the libraries listed below.
- VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models☆26Mar 26, 2025Updated last year
- Official repository for paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).☆160Sep 27, 2024Updated last year
- [ECCV 2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance☆18Sep 11, 2024Updated last year
- Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)☆26Jun 18, 2026Updated last week
- ☆65Jun 16, 2025Updated last year
- ☆23Sep 3, 2020Updated 5 years ago
- This is a collection of awesome papers I have read (carefully or roughly) in the fields of computer vision, machine learning, pattern rec…☆31Aug 8, 2024Updated last year
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- [ICDM 2023] Momentum is All You Need for Data-Driven Adaptive Optimization☆26Mar 30, 2024Updated 2 years ago
- Pytorch implementation for Egoinstructor at CVPR 2024☆28Dec 1, 2024Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning☆81Oct 25, 2024Updated last year
- Official Repository for CVPR 2024 Paper: "Large Language Models are Good Prompt Learners for Low-Shot Image Classification"☆45Jul 1, 2024Updated last year
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Dec 21, 2023Updated 2 years ago
- ☆15Feb 27, 2025Updated last year
- BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for Multi-View BEV 3D Object Detection☆13Apr 2, 2024Updated 2 years ago
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models☆112Nov 22, 2025Updated 7 months ago
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning☆28Mar 23, 2025Updated last year
- HD-EPIC Python script to download the entire datasets or parts of it☆22Oct 7, 2025Updated 8 months ago
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025☆31Apr 8, 2025Updated last year
- Python logging package for easy reproducible experimenting in research☆41Jul 29, 2025Updated 11 months ago
- [EMNLP'2024 Findings] Explore generated documents for enhanced IR with LLMs. We enhance BM25 to surpass strong dense retriever on many da…☆14Mar 28, 2025Updated last year
- ☆10Nov 16, 2023Updated 2 years ago
- Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025).☆30Jan 26, 2025Updated last year
- [ICLR'21] Neural Pruning via Growing Regularization (PyTorch)☆82Jul 15, 2021Updated 4 years ago
- [ICLR2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models☆26Feb 10, 2025Updated last year
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆68Jul 16, 2024Updated last year
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models☆60Apr 25, 2026Updated 2 months ago
- This repository is the official data collection of MMFundus (Multimodal Fundus) dataset.☆13Feb 2, 2026Updated 4 months ago
- [MICCAI-2023]Visual-Attribute Prompt Learning for Progressive Mild Cognitive Impairment Prediction☆15Dec 12, 2023Updated 2 years ago
- [EMNLP 2024 Main] Code for the paper "Dissecting Fine-Tuning Unlearning in Large Language Models"☆14Oct 10, 2024Updated last year
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding☆13May 9, 2025Updated last year
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts☆338Jul 17, 2024Updated last year
- ☆54Jan 17, 2025Updated last year
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin…☆41Sep 9, 2025Updated 9 months ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024)☆123Mar 26, 2025Updated last year
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.☆38Dec 30, 2025Updated 6 months ago
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments☆13Jul 8, 2024Updated last year
- An official repo for WACV 2025 paper "LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spa…☆31Jan 27, 2025Updated last year