hanningzhang/ER-PRM

☆20

Alternatives and similar repositories for ER-PRM

Users interested in ER-PRM are comparing it to the libraries listed below.

Trae1ounG / BuPO
[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
☆60Feb 6, 2026Updated 5 months ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta
☆13Nov 11, 2024Updated last year
siat-nlp / NLP-docs
Docs on NLP, deep learning, machine learning, etc. https://siat-nlp.github.io/docs
☆11Jul 17, 2019Updated 7 years ago
jinhangzhan / RL_Heals_SFT
☆21Mar 22, 2026Updated 4 months ago
ZihaoHuang-notabot / Ultra-Sparse-Memory-Network
☆48Jul 3, 2026Updated 3 weeks ago
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12May 17, 2024Updated 2 years ago
Farseer-Scaling-Law / Farseer
☆21Jun 12, 2025Updated last year
MrZilinXiao / ProxyThinker
[ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.
☆22Sep 24, 2025Updated 10 months ago
hanningzhang / prm
☆17Nov 3, 2024Updated last year
tinnerhrhe / ROVER
An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
☆36Oct 3, 2025Updated 9 months ago
foreverlasting1202 / QuestA
☆22Jan 2, 2026Updated 6 months ago
longrongyang / STGC
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
☆13Feb 11, 2025Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆15Oct 16, 2023Updated 2 years ago
tongxuluo / LeaP
Code, Data and Model for Paper "Learning from Peers in Reasoning Models"
☆26May 13, 2025Updated last year
kobayashikanna01 / Chain-of-Discussion
☆11May 28, 2024Updated 2 years ago
RUCKBReasoning / CoT-based-Synthesizer
Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'
☆32May 19, 2025Updated last year
UW-Madison-Lee-Lab / VersaPRM
☆37Feb 11, 2025Updated last year
wutaiqiang / awesome-GNN2MLP-distillation
Learning MLPs to replace GNNs
☆10Jun 3, 2023Updated 3 years ago
shenao-zhang / reward-augmented-preference
The official implementation of Preference Data Reward-Augmentation.
☆18May 1, 2025Updated last year
wzhouad / WPO
Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"
☆41Sep 24, 2024Updated last year
zzh-SJTU / CRT-QA
The official data and code for EMNLP 2023 main conference paper: CRT-QA: A Dataset of Complex Reasoning Question Answering over Tabular D…
☆13May 19, 2025Updated last year
WilliamZR / ProTrix
Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context
☆17Nov 15, 2024Updated last year
pandeydeep9 / EvidentialResearch2023
Analysis of evidential models
☆15Jun 22, 2023Updated 3 years ago
YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12Jan 26, 2025Updated last year
wondergo2017 / sild
Implementation code for the NeurIPS 2023 paper "Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts"
☆14Mar 19, 2024Updated 2 years ago
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆102Jun 24, 2025Updated last year
WangLabTHU / DeSP
DNA-D2S: a systematic error simulation Model for DNA Data Storage channel
☆12Feb 14, 2022Updated 4 years ago
fuyuanlyu / AutoFS-in-CTR
This repository contains several automated feature selection methods for CTR prediction.
☆10Dec 18, 2022Updated 3 years ago
yushuiwx / MH-MoE
☆20Nov 5, 2024Updated last year
jonnypei / acl23-preadd
☆12Jul 25, 2023Updated 3 years ago
Louise-LuLin / GCL-SPAN
Code for the paper "Spectrum Guided Topology Augmentation for Graph Contrastive Learning"
☆11Jul 18, 2023Updated 3 years ago
complex-reasoning / RPG
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76Jun 29, 2026Updated 3 weeks ago
SalesforceAIResearch / indict_code_gen
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
☆15Jun 2, 2026Updated last month
CommissarSilver / CVT
This repository contains the replication package of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Rep…
☆10Nov 16, 2023Updated 2 years ago
growvv / emo_is_all_you_need
Research on emotion recognition for script characters based on pre-trained BERT and GAT
☆13Dec 15, 2023Updated 2 years ago
ReliableCoding / REPEAT
☆10Apr 15, 2023Updated 3 years ago
Nero0113 / CoSec
☆13Oct 11, 2024Updated last year
hnmr293 / llama-viz
The attention map viewer for LLaMA models.
☆36Dec 16, 2023Updated 2 years ago
zhyang2226 / AR-Lopti
[ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs
☆46May 20, 2025Updated last year