facebookresearch / AbstentionBench

A holistic benchmark for LLM abstention

☆61

Alternatives and similar repositories for AbstentionBench

Users who are interested in AbstentionBench are comparing it to the repositories listed below.

zzwjames / FailureLLMUnlearning
An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)
☆35 Updated 9 months ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆29 Updated last month
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆50 Updated 9 months ago
LAMDASZ-ML / Self-Backtracking
☆51 Updated 9 months ago
shangshang-wang / Resa
Resa: Transparent Reasoning Models via SAEs
☆44 Updated 2 months ago
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆40 Updated last month
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆44 Updated 7 months ago
NuoJohnChen / JudgeLRM
JudgeLRM: Large Reasoning Models as a Judge
☆40 Updated 2 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 Updated last year
zjunlp / unlearn
[ACL 2025] Knowledge Unlearning for Large Language Models
☆46 Updated 2 months ago
katiekang1998 / reasoning_generalization
☆33 Updated 10 months ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆53 Updated 2 months ago
fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆34 Updated last year
uservan / ThinkPO
☆17 Updated 4 months ago
open-compass / GPassK
[ACL 2025] Are Your LLMs Capable of Stable Reasoning?
☆31 Updated 4 months ago
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆92 Updated 5 months ago
sunblaze-ucb / omega
☆42 Updated 5 months ago
menhguin / minp_paper
Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper
☆45 Updated 3 months ago
mandyyyyii / east
☆20 Updated 4 months ago
Infini-AI-Lab / GRESO
☆69 Updated 5 months ago
bigai-nlco / RuleReasoner
Official Repo for RuleReasoner.
☆28 Updated 5 months ago
test-time-interaction / TTI
☆65 Updated 5 months ago
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
☆135 Updated 2 months ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 Updated last year
g-luo / vlm_cross_modal_reps
Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025
☆31 Updated 7 months ago
sotopia-lab / sotopia-rl
Sotopia-RL: Reward Design for Social Intelligence
☆44 Updated 3 months ago
xufangzhi / phi-Decoding
[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling
☆106 Updated 6 months ago
qiuzh20 / EMoE
Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]
☆37 Updated last year
wang-kee / LiNeS
Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"
☆31 Updated last year
google-deepmind / bbeh
☆105 Updated 6 months ago