AlphaPav / mem-kk-logic
On Memorization of Large Language Models in Logical Reasoning
☆72 Updated 9 months ago
Alternatives and similar repositories for mem-kk-logic
Users who are interested in mem-kk-logic are comparing it to the repositories listed below.
- [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM… ☆68 Updated last year
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆119 Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆150 Updated 10 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆143 Updated last month
- ☆104 Updated last year
- GenRM-CoT: Data release for verification rationales ☆68 Updated last year
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆130 Updated 2 months ago
- RL Scaling and Test-Time Scaling (ICML'25) ☆112 Updated 11 months ago
- ☆215 Updated 10 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆65 Updated last year
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆63 Updated last year
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆134 Updated 9 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆191 Updated 11 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆114 Updated 5 months ago
- The official repository of the Omni-MATH benchmark. ☆91 Updated last year
- ☆109 Updated 5 months ago
- Code implementation of synthetic continued pretraining ☆144 Updated last year
- ☆53 Updated 10 months ago
- ☆26 Updated last year
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆142 Updated 10 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆70 Updated 9 months ago
- Towards Systematic Measurement for Long Text Quality ☆37 Updated last year
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆168 Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆83 Updated 11 months ago
- A research repo for experiments about Reinforcement Finetuning ☆53 Updated 9 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆29 Updated last year
- ☆96 Updated last year
- Model merging is a highly efficient approach for long-to-short reasoning. ☆95 Updated 2 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆159 Updated last year
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆76 Updated 3 months ago