sail-sg / GDPO
Graph Diffusion Policy Optimization
☆30 Updated 10 months ago
Alternatives and similar repositories for GDPO:
Users who are interested in GDPO are comparing it to the repositories listed below.
- Code for the tutorial/review paper on RL-based fine-tuning. In this code, we especially focus on the design of biological sequences li… ☆103 Updated 4 months ago
- SEIKO is a novel reinforcement learning method to efficiently fine-tune diffusion models in an online setting. Our methods outperform all… ☆18 Updated 6 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆55 Updated last month
- SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator ☆43 Updated last month
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆64 Updated 5 months ago
- ☆54 Updated 2 months ago
- The code of RouterDC ☆46 Updated last week
- ☆94 Updated 7 months ago
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆77 Updated this week
- ☆62 Updated last month
- ☆28 Updated 2 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View ☆46 Updated 3 months ago
- ☆21 Updated last month
- ☆50 Updated 7 months ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆27 Updated last year
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆62 Updated 2 months ago
- [SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates ☆68 Updated 3 months ago
- A brief and partial summary of RLHF algorithms. ☆89 Updated 2 months ago
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆20 Updated 2 months ago
- Official pytorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆31 Updated 2 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆58 Updated 2 weeks ago
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs" ☆20 Updated last month
- ☆76 Updated 6 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆62 Updated 2 months ago
- ☆46 Updated last month
- ☆31 Updated last week
- Official source code for "Graph Neural Networks for Learning Equivariant Representations of Neural Networks". In ICLR 2024 (oral). ☆77 Updated 6 months ago
- It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Representations (ICLR) i… ☆48 Updated 9 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆27 Updated 6 months ago