Papers related to Direct Preference Optimization (DPO)
☆20Jul 16, 2024Updated last year
Alternatives and similar repositories for awesome-DPO
Users that are interested in awesome-DPO are comparing it to the libraries listed below.
- RLVR for LLMs in optimization modeling☆51Apr 15, 2026Updated 3 weeks ago
- ☆29Apr 28, 2026Updated last week
- Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban☆19Jun 29, 2025Updated 10 months ago
- ☆10May 16, 2021Updated 4 years ago
- ☆34Jul 2, 2025Updated 10 months ago
- ☆13Jul 15, 2024Updated last year
- An Alfred Synology workflow built with Synology Python API☆18Feb 16, 2023Updated 3 years ago
- Engineering degree thesis - Structured Light based 3D Scanner☆13Mar 15, 2017Updated 9 years ago
- 🔥🔥🔥Latest Papers, Codes on Uncertainty-based RL☆59Aug 24, 2025Updated 8 months ago
- Estimating neural network runtime characteristics☆12Mar 25, 2023Updated 3 years ago
- The official implementation of Preference Data Reward-Augmentation.☆18May 1, 2025Updated last year
- From-Classification-to-Clinical☆13Apr 26, 2024Updated 2 years ago
- ☆16May 22, 2025Updated 11 months ago
- a tiny project to test the effectiveness of video QA through RAG techniques and multimodal LLMs☆15Jun 2, 2024Updated last year
- Algebraic value editing in pretrained language models☆70Nov 1, 2023Updated 2 years ago
- Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs☆22Apr 24, 2025Updated last year
- ☆30Dec 22, 2022Updated 3 years ago
- [SIGIR'25] Code of "Generative Recommender with End-to-End Learnable Item Tokenization".☆29Apr 17, 2025Updated last year
- Record experiment data easily☆14Aug 13, 2022Updated 3 years ago
- FairGAN: GANs-based Fairness-aware Learning for Recommendations with Implicit Feedback☆15Oct 8, 2022Updated 3 years ago
- ☆18Oct 8, 2024Updated last year
- Models, data, and codes for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models☆25Sep 26, 2024Updated last year
- ☆17Jun 30, 2020Updated 5 years ago
- The datasets of TSAD☆17Oct 20, 2025Updated 6 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks☆18Apr 24, 2024Updated 2 years ago
- Summer course teamwork: a set of cv tools based on PySide6 and opencv☆14Oct 5, 2023Updated 2 years ago
- [ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs☆45May 20, 2025Updated 11 months ago
- [NeurIPS‘2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al…☆17Mar 16, 2022Updated 4 years ago
- "AI Commit Message Tool uses AI to automatically generate concise and professional Git commit messages, which you can then edit and confi…☆14Jul 14, 2025Updated 9 months ago
- LLMAD code☆28Oct 31, 2024Updated last year
- ☆78Jun 28, 2025Updated 10 months ago
- ☆17Aug 1, 2025Updated 9 months ago
- ☆29Jul 16, 2024Updated last year
- A Survey of Direct Preference Optimization (DPO)☆97Jul 4, 2025Updated 10 months ago
- ☆16Sep 5, 2023Updated 2 years ago
- Implementation of paper "Do Wide and Deep Networks Learn the Same Things?"☆16Mar 15, 2022Updated 4 years ago
- The Neural Combinatorial Optimization Library (NCOLib) is an accessible software library designed to simplify the application of neural n…☆19Nov 14, 2025Updated 5 months ago
- ICS_2020_PJ☆11Dec 25, 2020Updated 5 years ago