Implementation of Direct Preference Optimization
☆17 · Jul 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for DPO
Users who are interested in DPO are comparing it to the repositories listed below.
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- [NeurIPS'20] Code for the paper "Offline Imitation Learning with a Misspecified Simulator" ☆12 · Nov 24, 2021 · Updated 4 years ago
- Project Euler GPT Resolver ☆10 · Feb 12, 2024 · Updated 2 years ago
- Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting ☆16 · Feb 14, 2024 · Updated 2 years ago
- ☆28 · Oct 30, 2025 · Updated 8 months ago
- Source for paper, "Data organization in spreadsheets" ☆22 · Sep 30, 2021 · Updated 4 years ago
- Standalone library of frequently-used wrappers for dm_env environments. ☆19 · Jul 9, 2024 · Updated last year
- Code for MOBILE: Model-Bellman Inconsistency Penalized Offline Policy Optimization ☆22 · Apr 17, 2024 · Updated 2 years ago
- ☆14 · Sep 7, 2024 · Updated last year
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning ☆29 · Feb 21, 2022 · Updated 4 years ago
- Experiments to train transformer network to master reinforcement learning environments. ☆32 · Mar 14, 2021 · Updated 5 years ago
- ☆30 · Mar 1, 2022 · Updated 4 years ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆15 · Jun 21, 2024 · Updated 2 years ago
- RLA is a tool for managing your RL experiments automatically ☆71 · Feb 7, 2023 · Updated 3 years ago
- Database for International Physics Olympiads ☆11 · Sep 22, 2025 · Updated 9 months ago
- Some notes and solutions to "Machine Learning" authored by Zhi-Hua Zhou ☆11 · Jul 20, 2021 · Updated 4 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆14 · Feb 20, 2024 · Updated 2 years ago
- Code for the paper: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains ☆11 · Nov 12, 2021 · Updated 4 years ago
- self-adaptive in-context learning ☆45 · May 5, 2023 · Updated 3 years ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ☆24 · Feb 10, 2025 · Updated last year
- (AAAI'2019) The codes, models, logs, and data for an extended paper of the original paper "On Reinforcement Learning for Full-length Game… ☆34 · Oct 5, 2022 · Updated 3 years ago
- Faster RCNN using TensorFlow ☆10 · Jul 31, 2022 · Updated 3 years ago
- Distributed RL Implementation using Pytorch and Ray (ApeX(Ape-X), A3C, Distributed-PPO(DPPO), Impala) ☆27 · Jun 8, 2022 · Updated 4 years ago
- Neural Networks exam project. Machine learning algorithm: implementation of FGSM and JSMA attacks by Goodfellow and Papernot. ☆16 · Jan 13, 2026 · Updated 5 months ago
- ☆13 · Feb 1, 2024 · Updated 2 years ago
- 🛠Robust SSH: auto-reconnect SSH session that preserves your running shell and command. Intuitive, no server-side setup, aimed at simplic… ☆13 · Nov 14, 2025 · Updated 7 months ago
- The evaluation code for A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5 ☆53 · Jan 18, 2026 · Updated 5 months ago
- ☆10 · Sep 19, 2023 · Updated 2 years ago
- CEVAE(Causal Effect Variational AutoEncoder) written with pytorch and pyro. ☆10 · Feb 15, 2021 · Updated 5 years ago
- This is MPE-pytorch, fix some bugs. ☆11 · Apr 26, 2020 · Updated 6 years ago
- General purpose hits (page views) counter written in Node.js backed by filesystem. (MVP) ☆14 · Oct 7, 2022 · Updated 3 years ago
- On-Chain Experiment Hub ☆14 · Mar 31, 2022 · Updated 4 years ago
- ☆15 · May 6, 2022 · Updated 4 years ago
- ☆10 · Nov 27, 2019 · Updated 6 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆67 · Dec 10, 2024 · Updated last year
- ☆15 · Jan 8, 2023 · Updated 3 years ago
- An implementation of Variational State Tabulation, from the paper here: https://arxiv.org/abs/1802.04325. ☆14 · Feb 25, 2024 · Updated 2 years ago
- Course Info for VIP-GEAI ☆11 · Apr 11, 2024 · Updated 2 years ago
- programmable e-paper tag with RFID ☆10 · Jul 12, 2023 · Updated 2 years ago