YangLabHKUST / UGPhysics
Official Repository of UGPhysics Benchmark [ICML 2025]
☆24Updated 5 months ago
Alternatives and similar repositories for UGPhysics
Users who are interested in UGPhysics are comparing it to the libraries listed below
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆152Updated 3 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆258Updated 5 months ago
- A comprehensive collection of process reward models.☆134Updated 3 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆271Updated 3 weeks ago
- A tiny, independent reproduction of DeepSeek R1 Zero on two A100s.☆83Updated 11 months ago
- A curated list of personalized alignment resources (continually updated).☆56Updated 3 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨☆273Updated last year
- This repository collects various works that reproduce DeepSeek R1, as well as works related to DeepSeek R1 and the DeepSeek series.☆18Updated 9 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.☆328Updated last year
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆55Updated last year
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …☆169Updated 8 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆151Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO)☆151Updated 11 months ago
- ☆332Updated 8 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆411Updated 6 months ago
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"☆155Updated 3 months ago
- This is my attempt to create a Self-Correcting-LLM based on the paper "Training Language Models to Self-Correct via Reinforcement Learning" by g…☆38Updated 6 months ago
- ☆182Updated last week
- A collection of survey papers and resources related to Large Language Models (LLMs).☆40Updated last year
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆411Updated 2 months ago
- simpleR1: A Simple Framework for Training R1-like Models☆30Updated 5 months ago
- ☆57Updated 2 years ago
- ☆22Updated last year
- ☆14Updated last year
- [NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models☆106Updated last year
- ☆73Updated 9 months ago
- ☆52Updated 10 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆143Updated 2 months ago
- ☆10Updated 11 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF)☆92Updated last year