multimodal-art-projection/REER_DeepWriter

REverse-Engineered Reasoning for Open-Ended Generation

☆98

Alternatives and similar repositories for REER_DeepWriter

Users interested in REER_DeepWriter are comparing it to the repositories listed below.

GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated 11 months ago
Quehry / HelloBench
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
☆60 · Nov 26, 2024 · Updated last year
yaof20 / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆22 · Nov 6, 2025 · Updated 8 months ago
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆165 · Jun 22, 2026 · Updated 3 weeks ago
ByteDance-Seed / WideSearch
WideSearch: Benchmarking Agentic Broad Info-Seeking
☆147 · Oct 9, 2025 · Updated 9 months ago
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆102 · Jun 24, 2025 · Updated last year
yongliang-wu / DFT
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
☆587 · Jan 4, 2026 · Updated 6 months ago
Freder-chen / ReasonGenRM
A simple implementation of ReasonGenRM.
☆19 · Apr 21, 2025 · Updated last year
CLR-Lab / SimKO
SimKO: Simple Pass@K Policy Optimization
☆31 · Oct 24, 2025 · Updated 8 months ago
SkyworkAI / Skywork-Reward-V2
Scaling Preference Data Curation via Human-AI Synergy
☆151 · Jul 3, 2025 · Updated last year
D2I-ai / dasd-thinking
☆105 · Jan 27, 2026 · Updated 5 months ago
ZephinueCode / TeamCode
☆18 · May 18, 2026 · Updated 2 months ago
inclusionAI / PromptCoT
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…
☆131 · Jan 31, 2026 · Updated 5 months ago
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆211 · Sep 8, 2025 · Updated 10 months ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
facebookresearch / llm_souping
Model souping for LLMs
☆73 · Nov 18, 2025 · Updated 8 months ago
zhangmiaosen2000 / Towards-On-Policy-SFT
☆19 · Mar 26, 2026 · Updated 3 months ago
Hui-design / R1-Video-fixbug
[Blog 1] Recording a bug of grpo_trainer in some R1 projects
☆23 · Feb 23, 2025 · Updated last year
KOR-Bench / KOR-Bench
☆19 · Nov 12, 2024 · Updated last year
MetaStone-AI / MetaStone-S1
The open-source code of MetaStone-S1.
☆106 · Aug 1, 2025 · Updated 11 months ago
TheRoadQaQ / ReLIFT
Official Repository of "Learning what reinforcement learning can't"
☆84 · Dec 30, 2025 · Updated 6 months ago
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆205 · Aug 12, 2025 · Updated 11 months ago
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
ChenxinAn-fdu / POLARIS
Scaling RL on advanced reasoning models
☆691 · Oct 20, 2025 · Updated 9 months ago
TIGER-AI-Lab / Hierarchical-Reasoner
Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning [ICLR26]
☆64 · Apr 11, 2026 · Updated 3 months ago
wenjunli-0 / deepresearch-survey
a survey on deep research
☆48 · Sep 9, 2025 · Updated 10 months ago
RM-R1-UIUC / RM-R1
[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models
☆167 · Jun 26, 2025 · Updated last year
GAIR-NLP / MegaScience
[COLM 2026] MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
☆123 · Jul 9, 2026 · Updated last week
ChengpengLi1003 / CoRT
☆72 · Oct 23, 2025 · Updated 8 months ago
THU-KEG / LongWriter-V
[ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
☆24 · Mar 29, 2025 · Updated last year
McGill-AML / mcfoamy_gazebo
☆10 · Jul 14, 2021 · Updated 5 years ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189 · Jun 5, 2025 · Updated last year
Linn3a / siren
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32 · Dec 10, 2025 · Updated 7 months ago
multimodal-art-projection / IV-Bench
☆14 · Apr 23, 2025 · Updated last year
UCSC-VLAA / ReasoningEval
Official repo of Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains.
☆43 · Jun 6, 2025 · Updated last year
tinnerhrhe / ROVER
An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
☆36 · Oct 3, 2025 · Updated 9 months ago
SalesforceAIResearch / PretrainRL-pipeline
An automated data pipeline scaling RL to pretraining levels
☆76 · Jun 2, 2026 · Updated last month
multimodal-art-projection / NL2RepoBench
☆144 · May 13, 2026 · Updated 2 months ago