chenlong-clock / RULE-Unlearn
[NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality
☆18Updated 2 months ago
Alternatives and similar repositories for RULE-Unlearn
Users who are interested in RULE-Unlearn are comparing it to the libraries listed below.
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024☆87Updated last year
- Official code for our paper "Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models"☆20Updated 2 months ago
- ☆60Updated 6 months ago
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free☆50Updated 9 months ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆55Updated 3 weeks ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)☆62Updated last year
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models☆48Updated last year
- Official code and data for the ACL 2024 Findings paper "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models"☆24Updated last year
- This repo is for the safety topic, including attacks, defenses and studies related to reasoning and RL☆59Updated 4 months ago
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …☆38Updated last year
- ☆43Updated last year
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"☆93Updated last year
- ☆25Updated 2 years ago
- ☆32Updated 10 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆46Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆64Updated last year
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"☆61Updated 2 years ago
- Official code for ICML 2024 paper on Persona In-Context Learning (PICLe)☆26Updated last year
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping☆62Updated 7 months ago
- [ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing☆36Updated last year
- The official implementation of "ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization…☆16Updated last year
- ☆71Updated last year
- ☆68Updated 10 months ago
- Code for Research Project TLDR☆25Updated 5 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆73Updated 6 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆69Updated this week
- [ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.☆87Updated 11 months ago
- [ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"☆58Updated 8 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆70Updated 6 months ago
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"☆38Updated 6 months ago