MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆13Updated last year
Alternatives and similar repositories for StableMask
Users who are interested in StableMask are comparing it to the libraries listed below.
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆24Updated last year
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆38Updated 9 months ago
- ☆18Updated 6 months ago
- Code for paper "Patch-Level Training for Large Language Models"☆85Updated 8 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆25Updated 6 months ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆40Updated 3 weeks ago
- ☆90Updated 2 months ago
- Open-Pandora: On-the-fly Control Video Generation☆34Updated 7 months ago
- ☆50Updated last year
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆39Updated last week
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆46Updated 4 months ago
- ☆51Updated last week
- ☆48Updated last month
- Code and Model for NeurIPS 2024 Spotlight Paper "Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training…☆42Updated 8 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆38Updated 4 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search"☆25Updated 2 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆23Updated 8 months ago
- FocusLLM: Scaling LLM’s Context by Parallel Decoding☆41Updated 7 months ago
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆56Updated 11 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆52Updated 5 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"☆95Updated last year
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models☆47Updated 2 months ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆35Updated 2 weeks ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Updated 9 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025]☆59Updated 3 months ago
- Official implementation of ECCV24 paper: POA☆24Updated 11 months ago
- Preference Learning for LLaVA☆46Updated 8 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆31Updated 2 years ago
- ☆42Updated 8 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆27Updated 2 months ago