TianheL / LM-Implicit-Reasoning
[Arxiv] Implicit Reasoning in Transformers is Reasoning through Shortcuts
☆14 Updated last month
Alternatives and similar repositories for LM-Implicit-Reasoning:
Users who are interested in LM-Implicit-Reasoning are comparing it to the repositories listed below.
- The official implementation of Preference Data Reward-Augmentation. ☆17 Updated 6 months ago
- [arXiv 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought" ☆10 Updated 3 weeks ago
- ☆63 Updated this week
- ☆46 Updated 2 months ago
- [Preprint] A Generalizable and Purely Unsupervised Self-Training Framework ☆50 Updated last week
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆71 Updated 3 weeks ago
- The official implementation of Cross-Task Experience Sharing (COPS) ☆22 Updated 6 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆24 Updated 6 months ago
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆72 Updated 3 months ago
- Knowledge Unlearning for Large Language Models ☆25 Updated 3 weeks ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 Updated 10 months ago
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆43 Updated 3 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025] ☆43 Updated 3 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆62 Updated this week
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Updated 4 months ago
- ☆22 Updated last week
- The code of arXiv paper: "Dynamic Scaling of Unit Tests for Code Reward Modeling" ☆18 Updated 3 months ago
- ☆14 Updated 3 months ago
- Official Code for paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S… ☆36 Updated 9 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated last month
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆36 Updated last week
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le… ☆13 Updated 3 months ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging ☆20 Updated 2 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆68 Updated last month
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆53 Updated 2 weeks ago
- The code of RouterDC ☆57 Updated last week
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆13 Updated 3 weeks ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆20 Updated 2 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆35 Updated 6 months ago
- The code implementation of Symbolic-MoE ☆27 Updated last month