declare-lab / safety-arithmetic
☆12Updated 6 months ago
Alternatives and similar repositories for safety-arithmetic
Users who are interested in safety-arithmetic are comparing it to the repositories listed below
- ☆15Updated 3 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling☆99Updated 2 months ago
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"☆20Updated 3 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆27Updated 4 months ago
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue☆35Updated last month
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".☆78Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆38Updated 4 months ago
- [ACL 2025] Knowledge Unlearning for Large Language Models☆39Updated 2 months ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences"☆19Updated 9 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024)☆114Updated last year
- The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs'☆21Updated 2 months ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆74Updated 2 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆52Updated 5 months ago
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling☆33Updated last year
- Code and Model for NeurIPS 2024 Spotlight Paper "Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training…☆42Updated 9 months ago
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]☆32Updated last year
- ☆33Updated last year
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"☆38Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆44Updated 3 months ago
- We introduce EMMET and unify model editing with popular algorithms ROME and MEMIT.☆24Updated 7 months ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆29Updated last year
- ☆5Updated 5 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆27Updated 7 months ago
- Codebase for Instruction Following without Instruction Tuning☆35Updated 9 months ago
- Codes for Merging Large Language Models☆33Updated 11 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"☆95Updated last year
- ☆18Updated 6 months ago
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes☆17Updated last month
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)☆121Updated last week
- ☆50Updated last month