WORM-MAX / iGSM-Replication-physics-LLM
This repository contains a replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Allen-Zhu.
☆18 · Updated 9 months ago
Alternatives and similar repositories for iGSM-Replication-physics-LLM
Users who are interested in iGSM-Replication-physics-LLM are comparing it to the repositories listed below.
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆108 · Updated 6 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆88 · Updated 8 months ago
- ☆71 · Updated 7 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆64 · Updated 6 months ago
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models" ☆25 · Updated last week
- [NeurIPS 2024] How do Large Language Models Handle Multilingualism? ☆34 · Updated 7 months ago
- ☆24 · Updated 2 months ago
- A Sober Look at Language Model Reasoning ☆74 · Updated last week
- GenRM-CoT: Data release for verification rationales ☆61 · Updated 8 months ago
- [ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.… ☆25 · Updated 9 months ago
- ☆119 · Updated last month
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 · Updated 7 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆141 · Updated 4 months ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆29 · Updated 7 months ago
- ☆65 · Updated 2 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆99 · Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆121 · Updated 9 months ago
- Open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆201 · Updated 10 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆65 · Updated 2 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆79 · Updated 8 months ago
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors" ☆39 · Updated last month
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆78 · Updated 5 months ago
- ☆40 · Updated last year
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆54 · Updated 2 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆105 · Updated 3 months ago
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆28 · Updated 2 months ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning> ☆48 · Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆12 · Updated 6 months ago
- Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ☆29 · Updated 3 weeks ago
- This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DP… ☆26 · Updated 6 months ago