yule-BUAA / MergeLLM
Codes for Merging Large Language Models
☆31 Updated 9 months ago
Alternatives and similar repositories for MergeLLM
Users that are interested in MergeLLM are comparing it to the libraries listed below
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆59 Updated 3 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆38 Updated 11 months ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆21 Updated this week
- ☆15 Updated last month
- ☆29 Updated last year
- A Sober Look at Language Model Reasoning ☆52 Updated last week
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆85 Updated 7 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆38 Updated 7 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 Updated 11 months ago
- ☆15 Updated last month
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆25 Updated last month
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆25 Updated 6 months ago
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling ☆32 Updated 10 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆38 Updated last year
- ☆18 Updated 6 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models" ☆24 Updated 2 weeks ago
- code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning ☆16 Updated 10 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 7 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆60 Updated 5 months ago
- ☆45 Updated last month
- [EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models ☆76 Updated last year
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆69 Updated 3 months ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ☆41 Updated 2 weeks ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆22 Updated 11 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆82 Updated 7 months ago
- ☆19 Updated 3 months ago
- ThinK: Thinner Key Cache by Query-Driven Pruning ☆18 Updated 3 months ago
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆21 Updated 8 months ago
- ☆22 Updated 11 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆45 Updated 7 months ago