swt-user/DMPO

☆54

Alternatives and similar repositories for DMPO

Users that are interested in DMPO are comparing it to the libraries listed below.

MadeAgents / Hammer
Hammer: Robust Function-Calling for On-Device Language Models via Function Masking
☆120 · Jun 13, 2025 · Updated last year
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆168 · Oct 30, 2024 · Updated last year
microsoft / tale-suite
Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.
☆30 · Updated this week
SalesforceAIResearch / xLAM
xLAM: A Family of Large Action Models to Empower AI Agent Systems
☆634 · Jun 2, 2026 · Updated last month
HanjiangHu / NBF-LLM
The official code for "Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks".
☆18 · Jun 24, 2026 · Updated 3 weeks ago
princeton-nlp / ELIZA-Transformer
[NAACL 2025] Representing Rule-based Chatbots with Transformers
☆23 · Feb 9, 2025 · Updated last year
BUAADreamer / Qwen2-VL-History
A LLaMA-Factory fine-tuning case for Qwen2-VL in the culture and tourism domain (historical literature and museums)
☆15 · Sep 17, 2024 · Updated last year
zai-org / ComplexFuncBench
Complex Function Calling Benchmark.
☆180 · Jan 20, 2025 · Updated last year
cassidylaidlaw / orpo
☆24 · Nov 11, 2024 · Updated last year
xjzzzzzzzz / MCPSafety
☆22 · Dec 18, 2025 · Updated 7 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 · Sep 24, 2024 · Updated last year
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year
padas-lab-de / ir-rag-sigir24-persona-rag
☆55 · Jun 23, 2026 · Updated 3 weeks ago
lqzxt / NGTR
☆14 · May 26, 2025 · Updated last year
chenchen0103 / ACEBench
☆187 · Oct 29, 2025 · Updated 8 months ago
YiyiyiZhao / siren
Welcome to the official repository for Siren, a project aimed at understanding and mitigating harmful behaviors in large language models …
☆15 · Jun 14, 2026 · Updated last month
xiaomi-research / dasheng-tokenizer
State-of-the-art continuous audio tokenization
☆40 · Mar 9, 2026 · Updated 4 months ago
fairyshine / Seal-Tools
The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar…
☆57 · Nov 5, 2024 · Updated last year
RUC-NLPIR / HiRA
The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search [SIGIR 2026]
☆65 · Jul 4, 2025 · Updated last year
camel-ai / seta-env
💻 SETA: Scaling Environments for Terminal Agents - Environments
☆143 · Feb 16, 2026 · Updated 5 months ago
Reason-Wang / ToolGen
[ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"
☆183 · Mar 26, 2025 · Updated last year
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 3 months ago
MIGHTYEZ / Inversion-DPO
☆19 · Jul 22, 2025 · Updated last year
KomeijiForce / MetaIE
This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a…
☆30 · Feb 23, 2025 · Updated last year
Alittleegg / Eureka-Audio
Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…
☆40 · Apr 11, 2026 · Updated 3 months ago
FFTYYY / RoR_relation_extraction
Code for the paper "Relation of the Relations: A New Formalization of the Relation Extraction Problem"
☆26 · Jun 12, 2023 · Updated 3 years ago
camel-ai / gecko
☆35 · Jul 8, 2026 · Updated 2 weeks ago
RUCBM / AgentProcessBench
☆27 · Mar 17, 2026 · Updated 4 months ago
AIGeeksGroup / PresentAgent-2
PresentAgent-2: Towards Generalist Multimodal Presentation Agents
☆17 · Jun 5, 2026 · Updated last month
iitmnlp / Dialogue-Evaluation-with-BERT
☆31 · Jan 16, 2021 · Updated 5 years ago
Hambaobao / Marathon
Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.
☆10 · May 16, 2024 · Updated 2 years ago
apple / ToolSandbox
☆267 · Nov 7, 2025 · Updated 8 months ago
DocTron-hub / Chart-R1
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
☆24 · Aug 7, 2025 · Updated 11 months ago
AlbertChen1991 / nEM
Code and data for EMNLP2019 Paper "Uncover the Ground-Truth Relations in Distant Supervision: A Neural Expectation-Maximization Framework…
☆10 · May 24, 2020 · Updated 6 years ago
INK-USC / PE2
Code for paper "Prompt Engineering a Prompt Engineer" (https://arxiv.org/abs/2311.05661)
☆12 · Aug 1, 2024 · Updated last year
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆105 · Dec 6, 2024 · Updated last year
nick7nlp / FastCuRL
FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025)
☆61 · Oct 10, 2025 · Updated 9 months ago
thuhcsi / Contextual-Biasing-Dataset
Open-source Mandarin biased-word dataset
☆14 · Sep 21, 2023 · Updated 2 years ago
Jazzcharles / AuroLA
☆28 · Feb 23, 2026 · Updated 4 months ago