WHU-ZQH / FSAM4PLM
[EMNLP22] Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
☆21 · Updated last year
Related projects
Alternatives and complementary repositories for FSAM4PLM
- Source code of COLING 2022 paper "A Contrastive Cross-channel Data Augmentation Framework for Aspect-based Sentiment Analysis" ☆20 · Updated last year
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning ☆44 · Updated 6 months ago
- The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting" ☆17 · Updated last year
- Dataset pruning for ImageNet and LAION-2B. ☆69 · Updated 4 months ago
- Retrieval-augmented Image Captioning ☆12 · Updated last year
- UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation) ☆84 · Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆43 · Updated 5 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild ☆47 · Updated last year
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆67 · Updated 4 months ago
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models" ☆25 · Updated 11 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control" ☆53 · Updated last year
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆29 · Updated 7 months ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 · Updated last year
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023) ☆78 · Updated last year
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024) ☆13 · Updated 7 months ago
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral) ☆63 · Updated 11 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆43 · Updated 3 months ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg… ☆17 · Updated 2 months ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities ☆33 · Updated 2 months ago
- 🎁[ChatGPT4NLU] A Comparative Study on ChatGPT and Fine-tuned BERT ☆193 · Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆33 · Updated last week
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023) ☆22 · Updated last year
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆45 · Updated last month
- Source code of EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters" ☆19 · Updated 7 months ago
- This repo is the official implementation of UPL (Unsupervised Prompt Learning for Vision-Language Models). ☆106 · Updated 2 years ago
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching ☆20 · Updated 7 months ago