ZhentingWang / DUMP
☆34 · May 9, 2025 · Updated 9 months ago
Alternatives and similar repositories for DUMP
Users who are interested in DUMP are comparing it to the repositories listed below.
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 · Jul 15, 2025 · Updated 7 months ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 · Mar 2, 2025 · Updated 11 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆14 · Jun 28, 2025 · Updated 7 months ago
- The code for "T-GRAG: A Dynamic GraphRAG Framework for Resolving Temporal Conflicts and Redundancy in Knowledge Retrieval" ☆21 · Jul 30, 2025 · Updated 6 months ago
- Exploration of automated dataset selection approaches at large scales. ☆52 · Mar 4, 2025 · Updated 11 months ago
- ☆33 · Jan 25, 2026 · Updated 3 weeks ago
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" ☆11 · Jan 10, 2025 · Updated last year
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 · Sep 22, 2025 · Updated 4 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 8 months ago
- ☆46 · Sep 27, 2025 · Updated 4 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding ☆66 · Jun 10, 2025 · Updated 8 months ago
- ☆15 · Sep 22, 2024 · Updated last year
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated 11 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation ☆27 · Feb 25, 2025 · Updated 11 months ago
- Ongoing research project for code&math LLMs ☆27 · Jul 4, 2025 · Updated 7 months ago
- ☆17 · Feb 4, 2025 · Updated last year
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- LLM4HWDesign Starting Toolkit ☆19 · Oct 4, 2024 · Updated last year
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆17 · Mar 26, 2025 · Updated 10 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated 10 months ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆37 · Oct 1, 2025 · Updated 4 months ago
- ☆19 · Mar 25, 2025 · Updated 10 months ago
- ☆20 · Apr 16, 2025 · Updated 10 months ago
- Control LLM ☆22 · Apr 6, 2025 · Updated 10 months ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 · Mar 4, 2025 · Updated 11 months ago
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution ☆24 · Nov 11, 2025 · Updated 3 months ago
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation" ☆18 · Oct 5, 2024 · Updated last year
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆26 · Aug 9, 2025 · Updated 6 months ago
- ☆15 · Feb 21, 2024 · Updated last year
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆20 · Feb 26, 2025 · Updated 11 months ago
- ☆20 · Aug 30, 2025 · Updated 5 months ago
- ☆19 · Mar 10, 2025 · Updated 11 months ago
- ☆20 · Feb 11, 2024 · Updated 2 years ago
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization" ☆29 · Jan 10, 2026 · Updated last month
- ☆20 · Nov 4, 2025 · Updated 3 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆89 · Sep 26, 2024 · Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆47 · Apr 15, 2025 · Updated 10 months ago
- ☆17 · Aug 1, 2025 · Updated 6 months ago
- ☆20 · Oct 12, 2024 · Updated last year