Implementation of Direct Preference Optimization
☆17 · Jul 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for DPO
Users who are interested in DPO are comparing it to the repositories listed below
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- ☆27 · Oct 30, 2025 · Updated 4 months ago
- ☆11 · Sep 7, 2024 · Updated last year
- Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting ☆17 · Feb 14, 2024 · Updated 2 years ago
- GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more! ☆21 · Oct 29, 2022 · Updated 3 years ago
- PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms ☆21 · Mar 24, 2025 · Updated 11 months ago
- Starter template for your ML/AI projects (uv package manager, RestAPI with FastAPI and Dockerfile support) ☆33 · Jan 13, 2025 · Updated last year
- Standalone library of frequently-used wrappers for dm_env environments. ☆18 · Jul 9, 2024 · Updated last year
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆32 · Jul 28, 2023 · Updated 2 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆31 · Mar 12, 2024 · Updated last year
- FID computation in Jax/Flax. ☆29 · Jul 17, 2024 · Updated last year
- Source for paper, "Data organization in spreadsheets" ☆22 · Sep 30, 2021 · Updated 4 years ago
- Distributed RL Implementation using Pytorch and Ray (ApeX(Ape-X), A3C, Distributed-PPO(DPPO), Impala) ☆27 · Jun 8, 2022 · Updated 3 years ago
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning ☆29 · Feb 21, 2022 · Updated 4 years ago
- ☆13 · Oct 5, 2025 · Updated 4 months ago
- Experiments to train transformer network to master reinforcement learning environments. ☆32 · Mar 14, 2021 · Updated 4 years ago
- (AAAI'2019) The codes, models, logs, and data for an extended paper of the original paper "On Reinforcement Learning for Full-length Game… ☆31 · Oct 5, 2022 · Updated 3 years ago
- Faster RCNN using TensorFlow ☆10 · Jul 31, 2022 · Updated 3 years ago
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech ☆28 · Sep 14, 2025 · Updated 5 months ago
- Source code for SWIFT, an efficient reward model. ☆18 · Jan 13, 2026 · Updated last month
- python port of arc90's readability bookmarklet, updated to match latest readability.js! ☆19 · Sep 13, 2011 · Updated 14 years ago
- A repo containing bash scripts to deploy reinforcement learning dev environment within one click! ☆10 · May 15, 2025 · Updated 9 months ago
- ☆12 · Jul 8, 2024 · Updated last year
- An extension to interact with DeSo Blockchain on few simple clicks 🥳 ☆12 · Jan 25, 2024 · Updated 2 years ago
- KCL Interface to UKB Project Data on Rosalind HPC cluster ☆14 · Apr 22, 2023 · Updated 2 years ago
- my profile readme ☆14 · Updated this week
- AI by AI ☆11 · Oct 19, 2023 · Updated 2 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch ☆10 · Aug 7, 2024 · Updated last year
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- Reference implementation of Thin and Deep Gaussian Processes (NeurIPS 2023) ☆14 · Nov 25, 2024 · Updated last year
- Vectorized implementation of a general feedforward neural network in Python ☆10 · Jan 22, 2017 · Updated 9 years ago
- Python package for compressing floating-point PyTorch tensors ☆13 · Jul 22, 2024 · Updated last year
- The course work repo for UoSurrey EEEM071 (2023 Spring) ☆11 · May 9, 2023 · Updated 2 years ago
- Simulation code and data of the paper - cold start to improve market thickness ☆12 · Jan 30, 2026 · Updated last month
- Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals ☆11 · Jan 8, 2026 · Updated last month
- GPU accelerated Perlin Noise in python ☆11 · Oct 23, 2020 · Updated 5 years ago
- ☆10 · Sep 19, 2023 · Updated 2 years ago
- Code for the paper: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains ☆10 · Nov 12, 2021 · Updated 4 years ago