This repository contains code for the paper Direct Preference Optimization with an Offset (ODPO).
☆19 · Feb 17, 2025 · Updated last year
Alternatives and similar repositories for odpo
Users who are interested in odpo are comparing it to the libraries listed below.
- ☆12 · Dec 13, 2022 · Updated 3 years ago
- ☆21 · Nov 19, 2023 · Updated 2 years ago
- ☆10 · Sep 28, 2018 · Updated 7 years ago
- A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti… · ☆21 · Jul 11, 2022 · Updated 3 years ago
- ☆13 · Dec 4, 2021 · Updated 4 years ago
- Code for EMNLP 2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". · ☆12 · Dec 27, 2023 · Updated 2 years ago
- Offline RL experiments · ☆15 · Oct 1, 2022 · Updated 3 years ago
- Joint use of the CPO and SimPO methods for better reference-free preference learning. · ☆57 · Aug 13, 2024 · Updated last year
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k · ☆14 · Oct 11, 2025 · Updated 7 months ago
- Computes the ratio of likes given to likes received per person in WeChat Moments · ☆11 · May 3, 2016 · Updated 10 years ago
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization" · ☆31 · Jan 10, 2026 · Updated 4 months ago
- Official repo for vidar and vidarc: video foundation model for robotics. · ☆39 · Dec 22, 2025 · Updated 5 months ago
- Pytorch implementation of BEAR in "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction" · ☆11 · Oct 29, 2019 · Updated 6 years ago
- Code for ACL '19 paper: Towards Improving Neural Named Entity Recognition with Gazetteers · ☆32 · Jul 2, 2021 · Updated 4 years ago
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … · ☆10 · Jul 18, 2023 · Updated 2 years ago
- Source code of NeurIPS 2022 paper “Co-Modality Graph Contrastive Learning for Imbalanced Node Classification” · ☆21 · Jan 15, 2023 · Updated 3 years ago
- ☆15 · Feb 26, 2025 · Updated last year
- Noise Reduction Methods for Distantly Supervised Biomedical Relation Extraction · ☆11 · Oct 25, 2017 · Updated 8 years ago
- SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025) · ☆17 · Aug 22, 2025 · Updated 9 months ago
- Large-scale import of triples from CSV files into a Neo4j database · ☆11 · Jul 31, 2020 · Updated 5 years ago
- Manipulate tensors with PackedSequence and CattedSequence · ☆12 · Jan 4, 2026 · Updated 4 months ago
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward · ☆954 · Feb 16, 2025 · Updated last year
- Basic PyTorch Implementation of 'Neural Architecture Search with Reinforcement Learning' (https://arxiv.org/abs/1611.01578) · ☆13 · Feb 24, 2018 · Updated 8 years ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" · ☆12 · Dec 8, 2024 · Updated last year
- A repo to design basic Policy Gradient labs · ☆12 · Jul 6, 2023 · Updated 2 years ago
- Repository containing the open source code of works published at the FBK MT unit. · ☆60 · Mar 19, 2026 · Updated 2 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ · ☆51 · Oct 23, 2024 · Updated last year
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization · ☆32 · Jan 7, 2026 · Updated 4 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" · ☆21 · Jan 31, 2026 · Updated 3 months ago
- A Python gradient-descent implementation of the Neighborhood Components Analysis (NCA) method for metric learning. · ☆16 · Jan 10, 2017 · Updated 9 years ago
- Code for the EACL 2024 paper: "Small Language Models Improve Giants by Rewriting Their Outputs" · ☆12 · Apr 20, 2024 · Updated 2 years ago
- Python package to augment multilingual data · ☆15 · Feb 15, 2023 · Updated 3 years ago
- Source code and dataset for ECML-PKDD 2020 paper "Hierarchical Interaction Networks with Rethinking Mechanism for Document-level Sentimen… · ☆10 · Jul 28, 2020 · Updated 5 years ago
- Code for "Improving Translation Faithfulness of Large Language Models via Augmenting Instructions" · ☆12 · Aug 26, 2023 · Updated 2 years ago
- My own playground for PLP (Programming Language Processing) using deep learning techniques · ☆19 · Apr 12, 2023 · Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" · ☆11 · Apr 10, 2025 · Updated last year
- Generic PyTorch implementation of einsum that supports different semirings · ☆50 · Dec 4, 2025 · Updated 5 months ago
- Classifying Relations by Ranking with Convolutional Neural Networks · ☆12 · May 22, 2019 · Updated 7 years ago
- Code accompanying the paper "Offline Reinforcement Learning with Value-Based Episodic Memory" (ICLR 2022 https://arxiv.org/abs/2110.0979… · ☆15 · Mar 9, 2022 · Updated 4 years ago