Implementation of Direct Preference Optimization
☆17 · Jul 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for DPO
Users who are interested in DPO are comparing it to the libraries listed below.
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- Project Euler GPT Resolver ☆10 · Feb 12, 2024 · Updated 2 years ago
- ☆28 · Oct 30, 2025 · Updated 7 months ago
- PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms ☆21 · Mar 24, 2025 · Updated last year
- Standalone library of frequently-used wrappers for dm_env environments. ☆19 · Jul 9, 2024 · Updated last year
- GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more! ☆20 · Oct 29, 2022 · Updated 3 years ago
- ☆14 · Jun 24, 2024 · Updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆31 · Mar 12, 2024 · Updated 2 years ago
- ☆13 · Sep 7, 2024 · Updated last year
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning ☆29 · Feb 21, 2022 · Updated 4 years ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆32 · Jul 28, 2023 · Updated 2 years ago
- ☆30 · Mar 1, 2022 · Updated 4 years ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆15 · Jun 21, 2024 · Updated last year
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers) ☆33 · Mar 20, 2026 · Updated 2 months ago
- Database for International Physics Olympiads ☆11 · Sep 22, 2025 · Updated 8 months ago
- Some notes and solutions to "Machine Learning" authored by Zhi-Hua Zhou ☆11 · Jul 20, 2021 · Updated 4 years ago
- ☆22 · Feb 4, 2026 · Updated 4 months ago
- This course navigates through the LLMOps pipeline, enabling you to preprocess training data for supervised fine-tuning and deploy cust… ☆15 · Feb 13, 2024 · Updated 2 years ago
- Code for the paper: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains ☆11 · Nov 12, 2021 · Updated 4 years ago
- self-adaptive in-context learning ☆45 · May 5, 2023 · Updated 3 years ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ☆24 · Feb 10, 2025 · Updated last year
- Faster RCNN using TensorFlow ☆10 · Jul 31, 2022 · Updated 3 years ago
- 🛠 Robust SSH: auto-reconnect SSH session that preserves your running shell and command. Intuitive, no server-side setup, aimed at simplic… ☆13 · Nov 14, 2025 · Updated 7 months ago
- Barebones Rust EVM Implementation ☆12 · Feb 9, 2022 · Updated 4 years ago
- A cog model for the all-mpnet-base-v2 sentence-transformers embedding model. ☆15 · Jan 3, 2024 · Updated 2 years ago
- Invertible neural network for gravitational wave parameter estimation ☆11 · Nov 22, 2022 · Updated 3 years ago
- Imitation learning from multiple experts ☆13 · Aug 29, 2022 · Updated 3 years ago
- A set of tools for use with the huff language. ☆21 · Jun 24, 2022 · Updated 3 years ago
- ☆15 · May 6, 2022 · Updated 4 years ago
- On-Chain Experiment Hub ☆14 · Mar 31, 2022 · Updated 4 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆66 · Dec 10, 2024 · Updated last year
- Code for Powderworld: A Platform for Understanding Generalization via Rich Task Distributions ☆74 · Aug 31, 2024 · Updated last year
- ☆15 · Jan 8, 2023 · Updated 3 years ago
- An implementation of Variational State Tabulation, from the paper here: https://arxiv.org/abs/1802.04325. ☆14 · Feb 25, 2024 · Updated 2 years ago
- programmable e-paper tag with RFID ☆10 · Jul 12, 2023 · Updated 2 years ago
- Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning) ☆12 · Nov 28, 2023 · Updated 2 years ago
- awesome video-based self-supervised learning methods in recent years ☆10 · Nov 26, 2020 · Updated 5 years ago
- ☆17 · Aug 29, 2022 · Updated 3 years ago