jhejna / cpl
Code for Contrastive Preference Learning (CPL)
☆147 · Updated 6 months ago
Related projects:
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆84 · Updated 5 months ago
- A continually updated collection of papers building on World Models. ☆127 · Updated 2 months ago
- MTM: Masked Trajectory Models for Prediction, Representation, and Control. ☆145 · Updated last year
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆116 · Updated last week
- Pre-Trained Language Models for Interactive Decision-Making [NeurIPS 2022] ☆116 · Updated 2 years ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆89 · Updated 2 months ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆197 · Updated last year
- We perform functional grounding of LLMs' knowledge in BabyAI-Text ☆213 · Updated 3 weeks ago
- Official implementation of the DECKARD Agent from the paper "Do Embodied Agents Dream of Pixelated Sheep?" ☆84 · Updated last year
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… ☆141 · Updated 3 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆82 · Updated last year
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆35 · Updated 8 months ago
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR 2023) ☆147 · Updated 11 months ago
- SmartPlay is a benchmark for Large Language Models (LLMs) that uses a variety of games to test important LLM capabilities as agents. … ☆115 · Updated 5 months ago
- Official Repo of LangSuitE ☆74 · Updated last month
- [ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning" ☆113 · Updated 8 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆31 · Updated 4 months ago
- DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. ☆206 · Updated last week
- Code for the paper "Autonomous Evaluation and Refinement of Digital Agents" ☆81 · Updated last week
- A simple and scalable agent for training adaptive policies with sequence-based RL ☆79 · Updated this week
- RLHF implementation details of OAI's 2019 codebase ☆144 · Updated 8 months ago
- The source code of the paper "Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Pla… ☆69 · Updated last month
- Efficient World Models with Context-Aware Tokenization (ICML 2024) ☆73 · Updated 2 months ago
- Official Repo for "Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning" ☆176 · Updated this week
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆24 · Updated last week