jianghoucheng / AnyEditLinks

AnyEdit: Edit Any Knowledge Encoded in Language Models, ICML 2025

☆33

Alternatives and similar repositories for AnyEdit

Users that are interested in AnyEdit are comparing it to the libraries listed below

Sorting:

wonderNefelibata / Awesome-LRM-Safety
Awesome Large Reasoning Model(LRM) Safety.This repository is used to collect security-related research on large reasoning models such as …
☆76Updated this week
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆37Updated 4 months ago
WanliYoung / Revisit-Editing-Evaluation
Code and data repository for "The Mirage of Model Editing: Revisiting Evaluation in the Wild"
☆16Updated 2 months ago
WangCheng0116 / Awesome-LRMs-Safety
Official repository for "Safety in Large Reasoning Models: A Survey" - Exploring safety risks, attacks, and defenses for Large Reasoning …
☆80Updated 2 months ago
AlphaLab-USTC / AlphaSteer
The implementation of paper "AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint"
☆27Updated last week
TrustGen / TrustEval-toolkit
Toolkit for evaluating the trustworthiness of generative foundation models.
☆123Updated 3 months ago
ZFancy / awesome-activation-engineering
A curated list of resources for activation engineering
☆111Updated last month
tmlr-group / AR-Bench
[ICML 2025] "From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?"
☆47Updated last month
jianghoucheng / AlphaEdit
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
☆366Updated last month
Alsace08 / Chain-of-Embedding
[ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"
☆83Updated 11 months ago
Unispac / shallow-vs-deep-alignment
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
☆163Updated 7 months ago
ybwang119 / Awesome-reasoning-safety
This repo is for the safety topic, including attacks, defenses and studies related to reasoning and RL
☆52Updated 2 months ago
ydyjya / SafetyHeadAttribution
☆54Updated 5 months ago
swj0419 / muse_bench
☆30Updated 8 months ago
ChnQ / MI-Peaks
☆55Updated 4 months ago
ZhiningLiu1998 / SelfElicit
[ACL'25 Main] SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence! | 让你的LLM更好地利用上下文文档：一个基于注意力的简单方案
☆23Updated 9 months ago
hanshen95 / SEAL
An implementation of SEAL: Safety-Enhanced Aligned LLM fine-tuning via bilevel data selection.
☆20Updated 9 months ago
VanillaCreamer / Awesome-Personalized-LLMs
The latest progress of Personalized Large Language Models (LLMs).
☆29Updated 3 weeks ago
glorgao / SelectiveDPO
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
☆44Updated 4 months ago
jinhaoduan / SAR
[ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
☆59Updated last year
fangjf1 / OpenSafeMLRM
The first toolkit for MLRM safety evaluation, providing unified interface for mainstream models, datasets, and jailbreaking methods!
☆14Updated 7 months ago
boyiwei / alignment-attribution-code
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
☆86Updated 7 months ago
jinzhuoran / RWKU
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆86Updated last year
ydyjya / LLM-IHS-Explanation
☆55Updated last year
MartinPawelczyk / In-Context-Unlearning
"In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel* and Himabindu Lakkaraju*; ICML 2024.
☆28Updated 2 years ago
Persdre / NeurIPS-2024-LLM-Papers
Accepted LLM Papers in NeurIPS 2024
☆37Updated last year
zepingyu0512 / awesome-SAE
awesome SAE papers
☆59Updated 5 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆175Updated last year
thu-ml / STAIR
Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"
☆85Updated 8 months ago
SCIR-SC-Qiaoban-Team / FreeEvalLM
☆10Updated 7 months ago