sail-sg / lm-random-memory-access
☆14 Updated last year
Alternatives and similar repositories for lm-random-memory-access:
Users who are interested in lm-random-memory-access are comparing it to the repositories listed below.
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆17 Updated 11 months ago
- ☆38 Updated last year
- Restore safety in fine-tuned language models through task arithmetic ☆28 Updated last year
- ☆34 Updated 6 months ago
- ☆29 Updated 11 months ago
- ☆13 Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆16 Updated last year
- Source code for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023). ☆16 Updated 3 months ago
- Align your LM to express calibrated verbal statements of confidence in its long-form generations. ☆22 Updated 10 months ago
- ☆25 Updated 2 years ago
- ☆40 Updated last year
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors" ☆38 Updated 10 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆53 Updated 4 months ago
- ☆49 Updated last year
- Methods and evaluation for aligning language models temporally ☆29 Updated last year
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆24 Updated 3 months ago
- ☆16 Updated 10 months ago
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23) ☆14 Updated last year
- This is the official repo for Towards Uncertainty-Aware Language Agent. ☆24 Updated 8 months ago
- Official code for the paper Probing the Decision Boundaries of In-context Learning in Large Language Models. https://arxiv.org/abs/2406.11233… ☆17 Updated 7 months ago
- ☆31 Updated 11 months ago
- ☆21 Updated last month
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆37 Updated 2 years ago
- Code for "Universal Adversarial Triggers Are Not Universal." ☆17 Updated 11 months ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆60 Updated last year
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting ☆17 Updated 8 months ago
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆26 Updated last year
- ☆14 Updated 6 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆68 Updated 2 years ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆57 Updated last year