heliossun / SQ-LLaVA
Visual self-questioning for large vision-language assistant.
☆45Jul 23, 2025Updated 6 months ago
Alternatives and similar repositories for SQ-LLaVA
Users who are interested in SQ-LLaVA compare it to the libraries listed below:
- Self-training LLaVA for medical☆16Nov 3, 2024Updated last year
- Official Repository for CVPR 2024 Paper: "Large Language Models are Good Prompt Learners for Low-Shot Image Classification"☆41Jul 1, 2024Updated last year
- ☆11Jan 8, 2025Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated last year
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments☆13Jul 8, 2024Updated last year
- Official repository for the paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).☆159Sep 27, 2024Updated last year
- BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for Multi-View BEV 3D Object Detection☆14Apr 2, 2024Updated last year
- Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.☆13May 29, 2024Updated last year
- [EMNLP 2024 Main] Code for the paper "Dissecting Fine-Tuning Unlearning in Large Language Models"☆15Oct 10, 2024Updated last year
- This is a collection of awesome papers I have read (carefully or roughly) in the fields of computer vision, machine learning, pattern rec…☆14Aug 8, 2024Updated last year
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"☆33Jan 26, 2026Updated 2 weeks ago
- ☆14Feb 27, 2025Updated 11 months ago
- [ECCV 2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance☆17Sep 11, 2024Updated last year
- PyTorch code for ECCV 2022 Oral paper "Modeling Mask Uncertainty in Hyperspectral Image Reconstruction"☆26Jul 23, 2022Updated 3 years ago
- [CVPR 2023] Adversarial Robustness via Random Projection Filters☆13Jun 20, 2023Updated 2 years ago
- ☆65Jun 16, 2025Updated 7 months ago
- [ICLR 2023] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆46Dec 1, 2024Updated last year
- Code repository for the CVPR 2024 paper "Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness"☆25May 29, 2024Updated last year
- ☆19Jan 8, 2025Updated last year
- ☆21Aug 8, 2024Updated last year
- ☆17Mar 9, 2025Updated 11 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆26Aug 9, 2025Updated 6 months ago
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts☆336Jul 17, 2024Updated last year
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning☆27Mar 23, 2025Updated 10 months ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning☆80Oct 25, 2024Updated last year
- VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models☆25Mar 26, 2025Updated 10 months ago
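- The simplest reproduction of R1 results on small models, illustrating the most important essence of O1-like models and DeepSeek R1. Think is all your need. Uses experiments to show that, for strong reasoning ability, the process content of thinking is the core of AGI/ASI.☆45Feb 8, 2025Updated last year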
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025☆31Apr 8, 2025Updated 10 months ago
- ☆21Apr 17, 2025Updated 9 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆85Oct 26, 2025Updated 3 months ago
- [CVPR 2024] PriViLege: Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners☆55Sep 5, 2024Updated last year
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models☆48Updated this week
- Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025).☆27Jan 26, 2025Updated last year
- [ICCV 2025] Deeply Supervised Flow-Based Generative Models☆27Jun 26, 2025Updated 7 months ago
- PyTorch implementation for Egoinstructor at CVPR 2024☆28Dec 1, 2024Updated last year
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models☆58Dec 20, 2024Updated last year
- A Simple and Efficient Reconstruction Backbone for Snapshot Compressive Imaging☆25Apr 18, 2023Updated 2 years ago