SCIR-SC-Qiaoban-Team / FreeEvalLM

☆10

Alternatives and similar repositories for FreeEvalLM

Users who are interested in FreeEvalLM are comparing it to the libraries listed below.

princeton-nlp / benign-data-breaks-safety
☆41 Updated last year
VITA-Group / SEAL
[COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆45 Updated 7 months ago
ChnQ / MI-Peaks
☆55 Updated 4 months ago
git-disl / Vaccine
This is the official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" (NeurIPS2024)
☆47 Updated last year
ChnQ / TracingLLM
☆30 Updated last year
Raibows / CREAM
Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.
☆27 Updated 9 months ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 Updated last year
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆87 Updated last week
Zanette-Labs / efficient-reasoning
☆67 Updated 7 months ago
boyiwei / alignment-attribution-code
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
☆86 Updated 7 months ago
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆92 Updated last year
ChenmienTan / malmen
☆35 Updated last year
ybwang119 / Awesome-reasoning-safety
This repo is for the safety topic, including attacks, defenses and studies related to reasoning and RL
☆52 Updated 2 months ago
limenlp / verl
AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
☆47 Updated 5 months ago
SophieZheng998 / ALI-Agent
Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation"
☆21 Updated 3 months ago
uw-nsl / safechain
[ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
☆25 Updated 7 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆62 Updated last year
rookie-joe / AutoPSV
☆50 Updated last year
jinzhuoran / RWKU
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆86 Updated last year
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆132 Updated 8 months ago
PKU-Alignment / llms-resist-alignment
[ACL 2025 Best Paper] Language Models Resist Alignment
☆36 Updated 5 months ago
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆37 Updated 4 months ago
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆43 Updated 3 months ago
sanowl / Self-Correcting-LLM--Reinforcement-Learning-
This is my attempt to create a Self-Correcting LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…
☆37 Updated 4 months ago
Zhou-Zoey / RMB-Reward-Model-Benchmark
☆45 Updated 7 months ago
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆67 Updated 11 months ago
TrustedLLM / UnKE
☆21 Updated 9 months ago
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆62 Updated 11 months ago
Zayne-sprague / To-CoT-or-not-to-CoT
☆25 Updated 7 months ago
WooooDyy / MathCritique
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
☆56 Updated 11 months ago