facebookresearch / AbstentionBench
A holistic benchmark for LLM abstention
☆53 · Updated last month
Alternatives and similar repositories for AbstentionBench
Users interested in AbstentionBench are comparing it to the repositories listed below.
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) · ☆29 · Updated 7 months ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" · ☆33 · Updated this week
- ☆48 · Updated 7 months ago
- Reinforcing General Reasoning without Verifiers · ☆87 · Updated 3 months ago
- Exploration of automated dataset selection approaches at large scales · ☆47 · Updated 7 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting · ☆33 · Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards · ☆44 · Updated 5 months ago
- JudgeLRM: Large Reasoning Models as a Judge · ☆39 · Updated 2 weeks ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling · ☆34 · Updated last month
- A Recipe for Building LLM Reasoners to Solve Complex Instructions · ☆24 · Updated 2 months ago
- ☆62 · Updated 3 months ago
- [ACL 2025] Knowledge Unlearning for Large Language Models · ☆42 · Updated 2 weeks ago
- ☆20 · Updated 2 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) · ☆40 · Updated 10 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods · ☆134 · Updated 3 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? · ☆30 · Updated 2 months ago
- ☆18 · Updated 2 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework · ☆71 · Updated 4 months ago
- ☆60 · Updated 3 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning · ☆122 · Updated 2 weeks ago
- ☆33 · Updated 8 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling · ☆104 · Updated 4 months ago
- Long Context Extension and Generalization in LLMs · ☆60 · Updated last year
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? · ☆80 · Updated 8 months ago
- Codebase for Instruction Following without Instruction Tuning · ☆35 · Updated last year
- ☆94 · Updated 4 months ago
- Code for "Reasoning to Learn from Latent Thoughts" · ☆119 · Updated 6 months ago
- A Sober Look at Language Model Reasoning · ☆83 · Updated 3 weeks ago
- ☆52 · Updated 3 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting · ☆41 · Updated last year