Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures
☆31Jan 29, 2026Updated last month
Alternatives and similar repositories for Awesome-LRM-Mechanisms
Users that are interested in Awesome-LRM-Mechanisms are comparing it to the libraries listed below
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆19Mar 31, 2025Updated 11 months ago
- ☆30Oct 22, 2025Updated 4 months ago
- exploring whether LLMs perform case-based or rule-based reasoning☆30Mar 2, 2024Updated 2 years ago
- The Implementation for the Paper "Time-Stamped Language Model: Teaching Language Models to Understand The Flow of Events"☆11May 6, 2021Updated 4 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- ☆14Oct 19, 2025Updated 4 months ago
- ☆11Apr 12, 2024Updated last year
- A supervised fine-tuning method for controllable reasoning length in large language models☆10May 8, 2025Updated 10 months ago
- A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning.☆19Aug 23, 2025Updated 6 months ago
- Official codebase for our paper "Do Language Models Use Their Depth Efficiently?"☆29Jun 25, 2025Updated 8 months ago
- LeanDojo-v2 is an end-to-end framework for training, evaluating, and deploying AI-assisted theorem provers for Lean 4.☆49Mar 10, 2026Updated last week
- Official implementation of the paper "M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding"☆21Jan 14, 2026Updated 2 months ago
- All-in-One Safety Evaluation Framework☆44Mar 4, 2026Updated 2 weeks ago
- Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.☆26Feb 18, 2026Updated last month
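- JoinAI is an open-source repository focused on building algorithm engineering skills, including organized notes on engineering and mathematical principles☆11Apr 20, 2025Updated 10 months ago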
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality☆20Oct 22, 2025Updated 4 months ago
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…☆86Dec 12, 2025Updated 3 months ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 10 months ago
- Code of paper: "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"☆17Feb 17, 2026Updated last month
- ☆16Oct 18, 2023Updated 2 years ago
- [ICLR 2024] Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks☆14Feb 6, 2024Updated 2 years ago
- EMNLP 2022 Demo "SynKB: Semantic Search for Chemical Synthesis Procedures"☆16Oct 31, 2022Updated 3 years ago
- ☆14Apr 6, 2025Updated 11 months ago
- Focused Papers, Delivered Simply :)☆51Dec 25, 2025Updated 2 months ago
- ☆19Sep 16, 2025Updated 6 months ago
- The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)☆14Aug 12, 2024Updated last year
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆24Mar 4, 2025Updated last year
- The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning☆22Sep 27, 2025Updated 5 months ago
- [ICLR 2026] Official code for [EdiVal-Agent Automated, object-centric evaluation for multi-turn instruction-based image editing]☆26Mar 1, 2026Updated 2 weeks ago
- Data and code for analyzing language associated with fictional characters.☆15Jan 6, 2018Updated 8 years ago
- Code of paper "AdvReverb: Rethinking the Stealthiness of Audio Adversarial Examples to Human Perception"☆19Nov 26, 2023Updated 2 years ago
- A Claude Code skill for sending messages to Feishu (飞书/Lark) via Webhook.☆28Feb 14, 2026Updated last month
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning☆36Oct 26, 2025Updated 4 months ago
- ☆30Jun 28, 2025Updated 8 months ago
- The official codes of Rethinking Knowledge Graph Evaluation Under the Open-World Assumption (NeurIPS 2022)☆22Sep 20, 2022Updated 3 years ago
- Agent-RRM: Exploring Reasoning Reward Model for Agents☆53Mar 5, 2026Updated last week
- [VLDB'2025] LEAP: LLM-powered End-to-end Automatic Library for Processing Social Science Queries on Unstructured Data☆19Nov 3, 2025Updated 4 months ago
- ☆28Feb 13, 2026Updated last month
- Whole cell model of E. coli implemented with Vivarium☆30Updated this week