QwenLM/WorldPM

QwenLM / WorldPM

☆93

Alternatives and similar repositories for WorldPM

Users who are interested in WorldPM are comparing it to the libraries listed below.

conceptmath / conceptmath
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large …
☆26 · May 29, 2024 · Updated 2 years ago
ChengpengLi1003 / DotaMath
☆30 · Dec 27, 2024 · Updated last year
KodCode-AI / code-r1
Reproducing R1 for Code with Reliable Rewards
☆13 · Apr 9, 2025 · Updated last year
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆18 · Dec 19, 2024 · Updated last year
Alexzhuan / awesome-kbqa
🤡 An up-to-date & curated list of awesome KBQA papers, methods & resources.
☆10 · Jul 14, 2022 · Updated 4 years ago
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆268 · Sep 12, 2024 · Updated last year
ByteDance-Seed / Seed-Thinking-v1.5
☆810 · Jun 9, 2025 · Updated last year
yingweima2022 / SWE-Reasoner
☆25 · Aug 2, 2025 · Updated 11 months ago
JiahuaHe / DeepMM
Full-length protein structure determination from cryo-EM maps using deep learning
☆15 · May 28, 2023 · Updated 3 years ago
pkshashank / GFLeanTransfer
☆14 · Mar 27, 2024 · Updated 2 years ago
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆480 · May 17, 2025 · Updated last year
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆120 · Dec 10, 2024 · Updated last year
kakao / diatool-dpo
☆15 · Aug 25, 2025 · Updated 10 months ago
Lagooon / LeanSTaR
☆44 · Sep 19, 2024 · Updated last year
fzyzcjy / ai_math_paper_list
AI for Mathematics Paper List
☆17 · Jan 14, 2025 · Updated last year
HAE-RAE / HAERAE-VISION
Evaluation code for HAERAE-Vision benchmark
☆15 · Apr 29, 2026 · Updated 2 months ago
StigLidu / TURN
[ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"
☆23 · Feb 16, 2025 · Updated last year
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49 · May 26, 2024 · Updated 2 years ago
phonism / CP-Zero
Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.
☆18 · Apr 22, 2025 · Updated last year
CLR-Lab / SimKO
SimKO: Simple Pass@K Policy Optimization
☆31 · Oct 24, 2025 · Updated 8 months ago
ByteDance-Seed / Seed-Coder
Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.
☆754 · Jun 6, 2025 · Updated last year
PremiLab-Math / MathCheck
[ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
☆34 · Oct 23, 2024 · Updated last year
LLM360 / MegaMath
[COLM 2025] An Open Math Pre-training Dataset with 370B Tokens.
☆110 · Apr 4, 2025 · Updated last year
SkyworkAI / Skywork-OR1
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
☆739 · Jun 6, 2025 · Updated last year
wellecks / llemma_formal2formal
Llemma formal2formal (tactic prediction) theorem proving experiments
☆20 · Oct 17, 2023 · Updated 2 years ago
THUDM / AlignBench
A multi-dimensional Chinese alignment evaluation benchmark for large language models (ACL 2024)
☆430 · Oct 25, 2025 · Updated 8 months ago
dongguanting / MSDP-Fewshot-NER
The code of CIKM 2023 (Oral Presentation): A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NE…
☆14 · Jul 19, 2024 · Updated 2 years ago
rookie-joe / FormalAlign
☆17 · Jul 12, 2025 · Updated last year
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆102 · Feb 20, 2025 · Updated last year
MathAutoTag / mathdata
K12 high-school mathematics exam question dataset
☆18 · Aug 16, 2023 · Updated 2 years ago
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,865 · Mar 18, 2025 · Updated last year
StigLidu / DualDistill
[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆104 · Apr 21, 2026 · Updated 3 months ago
ysy-phoenix / evalhub
All-in-one benchmarking platform for evaluating LLMs.
☆15 · Nov 12, 2025 · Updated 8 months ago
RM-R1-UIUC / RM-R1
[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models
☆167 · Jun 26, 2025 · Updated last year
Zhitao-He / AgentsCourt
AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings)
☆18 · Dec 30, 2024 · Updated last year
stepfun-ai / Step3
☆453 · Aug 10, 2025 · Updated 11 months ago
Zhou-Zoey / RMB-Reward-Model-Benchmark
☆48 · Mar 25, 2025 · Updated last year
hyz20 / D2Co
Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation. In Recsys23.
☆11 · Jul 18, 2023 · Updated 3 years ago
RUCAIBox / FIGA
[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"
☆10 · May 5, 2024 · Updated 2 years ago