ZhentingWang / DUMP

☆32

Alternatives and similar repositories for DUMP

Users who are interested in DUMP are comparing it to the repositories listed below.

UCSB-NLP-Chang / ThinkPrune
☆45 Updated last month
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆47 Updated 4 months ago
aeroplanepaper / GRPO-LEAD
☆30 Updated 2 months ago
hkust-nlp / RL-Verifier-Robustness
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆23 Updated last month
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆122 Updated 7 months ago
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆135 Updated 4 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆87 Updated this week
euclid-multimodal / Euclid
☆17 Updated 10 months ago
eric-ai-lab / MSSBench
[ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
☆30 Updated 4 months ago
holarissun / RewardModelingBeyondBradleyTerry
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆69 Updated 7 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆69 Updated 4 months ago
MJ-Bench / MJ-Bench
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
☆49 Updated 5 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆48 Updated last year
TIGER-AI-Lab / Hierarchical-Reasoner
Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning
☆48 Updated 3 weeks ago
ShadeCloak / ADORA
☆46 Updated 7 months ago
sail-sg / ActivePRM
☆19 Updated 7 months ago
kokolerk / TON
[NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
☆48 Updated last month
limenlp / verl
AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
☆47 Updated 5 months ago
princeton-pli / what-makes-good-rm
[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
☆39 Updated 2 months ago
yunfeixie233 / ViGaL
☆62 Updated last month
ChnQ / TracingLLM
☆30 Updated last year
shiqichen17 / VLM_Merging
Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
☆81 Updated last month
which47 / LLMCL
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
☆36 Updated last year
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆46 Updated last year
test-time-interaction / TTI
☆64 Updated 5 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆67 Updated 4 months ago
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆44 Updated 7 months ago
ZHZisZZ / weak-to-strong-search
[NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
☆63 Updated 11 months ago
bigai-nlco / LatentSeek
Official Repository of LatentSeek
☆68 Updated 5 months ago
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆13 Updated 3 weeks ago