Trustworthy-ML-Lab / ThinkEdit
An effective weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study uncovering how reasoning length is encoded in the model’s representation space.
☆12 · Updated 3 weeks ago
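The description above mentions editing weights based on how reasoning length is encoded in the model's representation space. As a minimal, hypothetical sketch (not ThinkEdit's actual code), one common form of such an edit projects a learned direction `v` (e.g. a "short-reasoning" direction) out of a weight matrix's output space:

```python
import numpy as np

def edit_weights(W: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Remove the component of W's outputs along direction v.

    Hypothetical illustration: W maps hidden states to hidden states;
    v is a direction in the output (representation) space to suppress.
    """
    v = v / np.linalg.norm(v)       # unit vector in representation space
    return W - np.outer(v, v) @ W   # apply the projector (I - v v^T) to W

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
v = rng.standard_normal(4)
W_edited = edit_weights(W, v)

# The edited matrix produces outputs with no component along v:
# v^T (I - v v^T) W = v^T W - v^T W = 0.
print(np.allclose(v @ W_edited, 0.0))  # True (up to floating-point error)
```

The actual method, editing targets (e.g. which layers or heads), and how the direction is found are described in the repository itself; this sketch only shows the generic projection idea.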
Alternatives and similar repositories for ThinkEdit
Users interested in ThinkEdit are comparing it to the repositories listed below.
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆25 · Updated last month
- The official repository of the paper "AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆15 · Updated last month
- A Sober Look at Language Model Reasoning ☆52 · Updated last week
- SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities ☆15 · Updated 2 months ago
- [ICLR 2025] Code & data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 · Updated 11 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models (https://arxiv.org/pdf/2411.02433) ☆25 · Updated 6 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆85 · Updated 7 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025 ☆22 · Updated 3 months ago
- Codebase for Decoding Compressed Trust ☆23 · Updated last year
- ☆49 · Updated last year
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆59 · Updated 3 months ago
- Representation Surgery for Multi-Task Model Merging (ICML 2024) ☆45 · Updated 7 months ago
- Code for Merging Large Language Models ☆31 · Updated 9 months ago
- ☆26 · Updated last year
- ☆19 · Updated 3 months ago
- ☆22 · Updated last month
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Updated last month
- [ICLR 2025] Code and data repo for the paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆56 · Updated 5 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆66 · Updated last week
- ☆36 · Updated 2 months ago
- This is the official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturba… ☆27 · Updated 2 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆59 · Updated last year
- ☆15 · Updated 9 months ago
- [ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.… ☆25 · Updated 8 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆35 · Updated 3 weeks ago
- Code for the paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆24 · Updated 11 months ago
- ☆24 · Updated last month
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆44 · Updated 7 months ago
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?" ☆35 · Updated 4 months ago
- ☆4 · Updated 4 months ago