GraphPKU / Case_or_Rule
exploring whether LLMs perform case-based or rule-based reasoning
☆28Updated last year
Alternatives and similar repositories for Case_or_Rule:
Users that are interested in Case_or_Rule are comparing it to the libraries listed below
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆38Updated last year
- Official completion of “Training on the Benchmark Is Not All You Need”.☆31Updated 3 months ago
- ☆30Updated 7 months ago
- ☆50Updated 2 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆20Updated 2 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆36Updated last week
- ☆22Updated 9 months ago
- ☆44Updated 5 months ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)☆51Updated 5 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆55Updated 10 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":☆36Updated last year
- ☆20Updated 2 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆49Updated 5 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆64Updated last week
- The code and data for the paper JiuZhang3.0☆43Updated 11 months ago
- ☆22Updated 9 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆58Updated 4 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆83Updated 7 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆68Updated last month
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆47Updated 3 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆91Updated 3 weeks ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆75Updated 4 months ago
- This the implementation of LeCo☆32Updated 3 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆26Updated 4 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 5 months ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆61Updated 5 months ago
- ☆46Updated 2 months ago
- ☆68Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆57Updated 6 months ago
- ☆23Updated 10 months ago