Implementation of Direct Preference Optimization
☆17 · Jul 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for DPO
Users that are interested in DPO are comparing it to the libraries listed below.
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- [NeurIPS'20] Code for the paper "Offline Imitation Learning with a Misspecified Simulator" ☆12 · Nov 24, 2021 · Updated 4 years ago
- Project Euler GPT Resolver ☆10 · Feb 12, 2024 · Updated 2 years ago
- Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting ☆16 · Feb 14, 2024 · Updated 2 years ago
- ☆27 · Oct 30, 2025 · Updated 5 months ago
- PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms ☆21 · Mar 24, 2025 · Updated last year
- Standalone library of frequently-used wrappers for dm_env environments. ☆19 · Jul 9, 2024 · Updated last year
- ☆14 · Jun 24, 2024 · Updated last year
- Starter template for your ML/AI projects (uv package manager, REST API with FastAPI and Dockerfile support) ☆34 · Jan 13, 2025 · Updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆31 · Mar 12, 2024 · Updated 2 years ago
- ☆12 · Sep 7, 2024 · Updated last year
- FID computation in Jax/Flax. ☆29 · Jul 17, 2024 · Updated last year
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning ☆29 · Feb 21, 2022 · Updated 4 years ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆32 · Jul 28, 2023 · Updated 2 years ago
- Experiments to train transformer network to master reinforcement learning environments. ☆32 · Mar 14, 2021 · Updated 5 years ago
- Mamba support for transformer lens ☆19 · Sep 17, 2024 · Updated last year
- ☆30 · Mar 1, 2022 · Updated 4 years ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆15 · Jun 21, 2024 · Updated last year
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers) ☆31 · Mar 20, 2026 · Updated 3 weeks ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ☆21 · Feb 10, 2025 · Updated last year
- Database for International Physics Olympiads ☆10 · Sep 22, 2025 · Updated 6 months ago
- Some notes and solutions to "Machine Learning" authored by Zhi-Hua Zhou ☆11 · Jul 20, 2021 · Updated 4 years ago
- Code for CascadeBERT, Findings of EMNLP 2021 ☆12 · Mar 30, 2022 · Updated 4 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆14 · Feb 20, 2024 · Updated 2 years ago
- Barebones Rust EVM Implementation ☆12 · Feb 9, 2022 · Updated 4 years ago
- ☆10 · Sep 19, 2023 · Updated 2 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ☆11 · Nov 3, 2020 · Updated 5 years ago
- Invertible neural network for gravitational wave parameter estimation ☆11 · Nov 22, 2022 · Updated 3 years ago
- Imitation learning from multiple experts ☆13 · Aug 29, 2022 · Updated 3 years ago
- ☆15 · May 6, 2022 · Updated 3 years ago
- On-Chain Experiment Hub ☆14 · Mar 31, 2022 · Updated 4 years ago
- ☆10 · Nov 27, 2019 · Updated 6 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆67 · Dec 10, 2024 · Updated last year
- Triton implementation of GPT/LLAMA ☆21 · Aug 28, 2024 · Updated last year
- Code for Powderworld: A Platform for Understanding Generalization via Rich Task Distributions ☆74 · Aug 31, 2024 · Updated last year
- ☆15 · Jan 8, 2023 · Updated 3 years ago
- Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning) ☆12 · Nov 28, 2023 · Updated 2 years ago
- (Unofficial) Implementation of the paper "Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS" Chen et al. ☆14 · Dec 25, 2024 · Updated last year