Edward-Sun/easy-to-hard

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

☆124

Alternatives and similar repositories for easy-to-hard

Users who are interested in easy-to-hard are comparing it to the repositories listed below.

Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆78 • Apr 2, 2024 • Updated 2 years ago
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆120 • Dec 10, 2024 • Updated last year
hkust-nlp / B-STaR
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
☆86 • May 21, 2025 • Updated last year
koalazf99 / nanoverl
Collections of RLxLM experiments using minimal codes
☆14 • Feb 17, 2025 • Updated last year
FreedomIntelligence / OVM
☆74 • Apr 2, 2024 • Updated 2 years ago
StigLidu / TURN
[ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"
☆23 • Feb 16, 2025 • Updated last year
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆170 • Sep 18, 2025 • Updated 10 months ago
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆68 • Oct 16, 2024 • Updated last year
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆75 • Jul 13, 2025 • Updated last year
OpenBMB / Eurus
☆322 • Sep 18, 2024 • Updated last year
mukhal / GRACE
[EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning
☆50 • Oct 11, 2024 • Updated last year
hkust-nlp / llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆150 • Sep 20, 2024 • Updated last year
MARIO-Math-Reasoning / Super_MARIO
☆341 • Jun 5, 2025 • Updated last year
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆709 • Jan 20, 2025 • Updated last year
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆63 • Dec 5, 2024 • Updated last year
alecwangcq / f-divergence-dpo
Direct preference optimization with f-divergences.
☆17 • Nov 3, 2024 • Updated last year
jesse-michael-han / lean-tpe-public
The Lean Theorem Proving Environment
☆15 • May 7, 2023 • Updated 3 years ago
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 • Dec 9, 2024 • Updated last year
Asap7772 / fewshot-preference-optimization
Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptat…
☆16 • Feb 27, 2025 • Updated last year
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,247 • May 8, 2024 • Updated 2 years ago
TomSheng21 / AdaptGuard
ICCV 2023 - AdaptGuard: Defending Against Universal Attacks for Model Adaptation
☆11 • Dec 23, 2023 • Updated 2 years ago
GAIR-NLP / ReasonEval
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
☆80 • Oct 9, 2025 • Updated 9 months ago
qtli / GSM-Plus
GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.
☆66 • Jul 8, 2024 • Updated 2 years ago
alif-munim / minOFT
A minimal re-implementation of orthogonal fine-tuning (OFT), a diffusion method, for LLMs. Based on nanoGPT and minLoRA.
☆14 • Nov 17, 2023 • Updated 2 years ago
MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆15 • Jun 27, 2024 • Updated 2 years ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 • Jan 17, 2024 • Updated 2 years ago
EleutherAI / semantic-memorization
☆44 • Nov 17, 2024 • Updated last year
Lagooon / LeanSTaR
☆44 • Sep 19, 2024 • Updated last year
McGill-NLP / VinePPO
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
☆192 • May 25, 2025 • Updated last year
cmu-l3 / llmlean
LLMs + Lean, on your laptop or in the cloud
☆213 • Oct 10, 2025 • Updated 9 months ago
meta-math / MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
☆455 • Feb 1, 2024 • Updated 2 years ago
cmu-l3 / ntptutorial-II
Neural theorem proving tutorial, version II
☆39 • Apr 26, 2024 • Updated 2 years ago
cmu-l3 / minictx-eval
Neural theorem proving evaluation via the Lean REPL
☆24 • Jul 12, 2025 • Updated last year
hkust-nlp / deepsearch-tts
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
☆21 • Oct 8, 2025 • Updated 9 months ago
kttian / llm_factuality_tuning
☆40 • May 2, 2024 • Updated 2 years ago
hkust-nlp / LOCA-bench
Benchmarking Language Agents Under Controllable and Extreme Context Growth
☆50 • Apr 29, 2026 • Updated 2 months ago
eddycmu / demystify-long-cot
☆336 • May 31, 2025 • Updated last year
ai4reason / ATP_Proofs
Interesting ATP Proofs
☆13 • Sep 3, 2021 • Updated 4 years ago
lqtrung1998 / mwp_ReFT
☆554 • Jan 2, 2025 • Updated last year