MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆12Updated 8 months ago
Alternatives and similar repositories for StableMask:
Users that are interested in StableMask are comparing it to the libraries listed below
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆26Updated 2 weeks ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆37Updated 5 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆40Updated 3 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆55Updated last month
- ☆71Updated this week
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆33Updated 8 months ago
- Open-Pandora: On-the-fly Control Video Generation☆32Updated 3 months ago
- ☆36Updated this week
- Code for Findings of EMNLP2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners"☆12Updated last year
- The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark"☆45Updated last month
- ☆39Updated 4 months ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆32Updated last year
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆72Updated last year
- Preference Learning for LLaVA☆40Updated 4 months ago
- ☆27Updated last year
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"☆13Updated last week
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆24Updated last year
- 🚀LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training☆73Updated 3 months ago
- Code for paper "Patch-Level Training for Large Language Models"☆81Updated 4 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated last year
- CLIP-MoE: Mixture of Experts for CLIP☆29Updated 5 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆44Updated last month
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆69Updated 4 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆44Updated 7 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆14Updated 3 weeks ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)"☆36Updated 11 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 5 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆65Updated 9 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆54Updated 5 months ago
- ☆21Updated 8 months ago