Implementation of Direct Preference Optimization
☆17Jul 17, 2023Updated 2 years ago
Alternatives and similar repositories for DPO
Users that are interested in DPO are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24)☆17Feb 10, 2024Updated 2 years ago
- Project Euler GPT Resolver☆10Feb 12, 2024Updated 2 years ago
- Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting☆16Feb 14, 2024Updated 2 years ago
- ☆27Oct 30, 2025Updated 5 months ago
- Source for paper, "Data organization in spreadsheets"☆22Sep 30, 2021Updated 4 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Standalone library of frequently-used wrappers for dm_env environments.☆19Jul 9, 2024Updated last year
- ☆14Jun 24, 2024Updated last year
- Starter template for your ML/AI projects (uv package manager, RestAPI with FastAPI and Dockerfile support)☆34Jan 13, 2025Updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆31Mar 12, 2024Updated 2 years ago
- ☆12Sep 7, 2024Updated last year
- ☆16Mar 22, 2024Updated 2 years ago
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning☆29Feb 21, 2022Updated 4 years ago
- Experiments to train transformer network to master reinforcement learning environments.☆32Mar 14, 2021Updated 5 years ago
- Mamba support for transformer lens☆19Sep 17, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- ☆30Mar 1, 2022Updated 4 years ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆15Jun 21, 2024Updated last year
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)☆31Mar 20, 2026Updated 3 weeks ago
- RLA is a tool for managing your RL experiments automatically☆71Feb 7, 2023Updated 3 years ago
- ☆22Feb 4, 2026Updated 2 months ago
- Code for CascadeBERT, Findings of EMNLP 2021☆12Mar 30, 2022Updated 4 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge☆14Feb 20, 2024Updated 2 years ago
- ☆10Jan 20, 2023Updated 3 years ago
- Code for the paper: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains☆10Nov 12, 2021Updated 4 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- (AAAI'2019) The codes, models, logs, and data for an extended paper of the original paper "On Reinforcement Learning for Full-length Game…☆33Oct 5, 2022Updated 3 years ago
- Faster RCNN using TensorFlow☆10Jul 31, 2022Updated 3 years ago
- ☆13Feb 1, 2024Updated 2 years ago
- The evaluation code for A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5☆53Jan 18, 2026Updated 2 months ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.☆11Nov 3, 2020Updated 5 years ago
- A cog model for the all-mpnet-base-v2 sentence-transformers embedding model.☆15Jan 3, 2024Updated 2 years ago
- Probabilistic single-cell pseudotime with Edward+Tensorflow☆12Oct 5, 2017Updated 8 years ago
- This is MPE-pytorch, fix some bugs.☆11Apr 26, 2020Updated 5 years ago
- Imitation learning from multiple experts☆13Aug 29, 2022Updated 3 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- A set of tools for use with the huff language.☆21Jun 24, 2022Updated 3 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆67Dec 10, 2024Updated last year
- Triton implementation of GPT/LLAMA☆21Aug 28, 2024Updated last year
- Processing WGS aDNA data using the ReichLab protocol☆13Mar 8, 2019Updated 7 years ago
- Course Info for VIP-GEAI☆11Apr 11, 2024Updated 2 years ago
- programmable e-paper tag with RFID☆10Jul 12, 2023Updated 2 years ago
- Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)☆12Nov 28, 2023Updated 2 years ago