lucywang720 / model-surgery
☆28 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for model-surgery
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆24 Updated 4 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆57 Updated 3 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆39 Updated 3 months ago
- [Arxiv 2024] Adversarial attacks on multimodal agents ☆40 Updated 4 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆55 Updated 3 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆98 Updated 7 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆33 Updated last month
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆68 Updated 5 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆52 Updated 3 weeks ago
- ☆12 Updated last month
- A Survey on the Honesty of Large Language Models ☆47 Updated last month
- ☆54 Updated 2 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards ☆44 Updated 6 months ago
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation ☆93 Updated 10 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆76 Updated this week
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆29 Updated last month
- Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆60 Updated last month
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 Updated last month
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆48 Updated 7 months ago
- ☆90 Updated 4 months ago
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning ☆23 Updated last week
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts ☆19 Updated 9 months ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆15 Updated 4 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆84 Updated 9 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆84 Updated 6 months ago
- ☆29 Updated last year
- Directional Preference Alignment ☆51 Updated 2 months ago
- Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆36 Updated this week
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆45 Updated 7 months ago