MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆12 Updated 4 months ago
Related projects
Alternatives and complementary repositories for StableMask
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆32 Updated last month
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆59 Updated this week
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆29 Updated last year
- code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆15 Updated 6 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆63 Updated 10 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆27 Updated last week
- CLIP-MoE: Mixture of Experts for CLIP ☆17 Updated last month
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆29 Updated last month
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆32 Updated last month
- ☆77 Updated 4 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆39 Updated 3 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆26 Updated 4 months ago
- ☆27 Updated last year
- [NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆36 Updated this week
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆32 Updated last year
- This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness … ☆19 Updated last year
- ☆30 Updated this week
- Large Language Models Can Self-Improve in Long-context Reasoning ☆36 Updated last week
- Code for paper "Patch-Level Training for Large Language Models" ☆72 Updated last week
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆35 Updated 3 weeks ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆28 Updated 7 months ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆30 Updated last month
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens ☆21 Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆24 Updated 4 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆33 Updated last week
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.… ☆37 Updated last year
- ☆24 Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆59 Updated 5 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆94 Updated 7 months ago