fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆32 Updated last year
Alternatives and similar repositories for CoT-Reasoning-without-Prompting
Users interested in CoT-Reasoning-without-Prompting are comparing it to the libraries listed below.
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 Updated last year
- Large Language Models Can Self-Improve in Long-context Reasoning ☆69 Updated 6 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆98 Updated 3 weeks ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆31 Updated 8 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆44 Updated 10 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆70 Updated 2 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 7 months ago
- ☆59 Updated 9 months ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆73 Updated 2 weeks ago
- ☆22 Updated 10 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆59 Updated 3 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆50 Updated 5 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆57 Updated 7 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆59 Updated last year
- Directional Preference Alignment ☆56 Updated 8 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆26 Updated last year
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 3 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆64 Updated last month
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 7 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆58 Updated 6 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆102 Updated 4 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 Updated 7 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 6 months ago
- Codebase for Instruction Following without Instruction Tuning ☆34 Updated 8 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆61 Updated 5 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆28 Updated 10 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆60 Updated 5 months ago
- ☆35 Updated 3 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆62 Updated 10 months ago