WORM-MAX / iGSM-Replication-physics-LLM

This repository replicates the iGSM dataset generation process from the Physics of Language Models paper by Zeyuan Allen-Zhu.

☆18

Alternatives and similar repositories for iGSM-Replication-physics-LLM

Users who are interested in iGSM-Replication-physics-LLM are comparing it to the repositories listed below.

hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆108 Updated 6 months ago
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆88 Updated 8 months ago
thu-wyz / inference_scaling
☆71 Updated 7 months ago
Alsace08 / Chain-of-Embedding
[ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"
☆64 Updated 6 months ago
tmlr-group / landscape-of-thoughts
[ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"
☆25 Updated last week
DAMO-NLP-SG / multilingual_analysis
[NeurIPS 2024] How do Large Language Models Handle Multilingualism?
☆34 Updated 7 months ago
Zayne-sprague / To-CoT-or-not-to-CoT
☆24 Updated 2 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆74 Updated last week
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆61 Updated 8 months ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆25 Updated 9 months ago
GeniusHTX / TALE
☆119 Updated last month
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆48 Updated 7 months ago
Vance0124 / Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization (TDPO)
☆141 Updated 4 months ago
hanxuhu / SeqIns
The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…
☆29 Updated 7 months ago
Zanette-Labs / efficient-reasoning
☆65 Updated 2 months ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆99 Updated last month
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆121 Updated 9 months ago
nightdessert / Retrieval_Head
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
☆201 Updated 10 months ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆65 Updated 2 months ago
sail-sg / Cheating-LLM-Benchmarks
[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
☆79 Updated 8 months ago
chujiezheng / LLM-MCQ-Bias
Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors"
☆39 Updated last month
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆78 Updated 5 months ago
deeplearning-wisc / args
☆40 Updated last year
Blueyee / Efficient-CoT-LRMs
Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process!
☆54 Updated 2 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆105 Updated 3 months ago
VITA-Group / SEAL
Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆28 Updated 2 months ago
NingMiao / SelfCheck
Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning>
☆48 Updated last year
alessiodevoto / l2compress
Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."
☆12 Updated 6 months ago
tianyi-lab / MiP-Overthinking
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆29 Updated 3 weeks ago
WeiXiongUST / Building-Math-Agents-with-Multi-Turn-Iterative-Preference-Learning
This is an official implementation of the paper ``Building Math Agents with Multi-Turn Iterative Preference Learning'' with multi-turn DP…
☆26 Updated 6 months ago