[COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆55Apr 6, 2025Updated 11 months ago
Alternatives and similar repositories for SEAL
Users who are interested in SEAL are comparing it to the libraries listed below.
- [EMNLP 25] An effective and interpretable weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study un…☆17Dec 17, 2025Updated 3 months ago
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails☆32Feb 26, 2025Updated last year
- ☆19Aug 4, 2025Updated 7 months ago
- ☆18Aug 19, 2024Updated last year
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs☆13Jun 20, 2025Updated 9 months ago
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering☆105Nov 23, 2024Updated last year
- [ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data☆14Feb 26, 2025Updated last year
- ☆15Feb 26, 2025Updated last year
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆19Apr 1, 2025Updated 11 months ago
- Inverse Scaling in Test-Time Compute☆25Dec 3, 2025Updated 3 months ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps?☆22Jan 25, 2025Updated last year
- Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as …☆82Updated this week
- ☆29Nov 16, 2025Updated 4 months ago
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving☆24Aug 25, 2025Updated 6 months ago
- Official Repo of Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents☆68Oct 28, 2025Updated 4 months ago
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp…☆18Updated this week
- ☆16Feb 8, 2024Updated 2 years ago
- Preparing for ML Interviews.☆53Jan 12, 2026Updated 2 months ago