patrick-tssn / LM-Research-Hub
Language Modeling Research Hub: a compendium for enthusiasts and scholars studying language models (LMs), with a particular focus on large language models (LLMs).
☆15 Updated 3 weeks ago
Related projects:
- ☆102 Updated 2 months ago
- Paper collection of methods that use language to interact with an environment, including interaction with the real world, simulated worlds, or the WWW… ☆121 Updated last year
- Paper collection covering the continuing line of work that started from World Models. ☆127 Updated 2 months ago
- ☆76 Updated last month
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆89 Updated 2 months ago
- Official Repo of LangSuitE ☆74 Updated last month
- [ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning" ☆113 Updated 8 months ago
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning". ☆138 Updated 8 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆86 Updated 3 months ago
- ☆24 Updated 6 months ago
- Pre-Trained Language Models for Interactive Decision-Making [NeurIPS 2022] ☆116 Updated 2 years ago
- [ACL'2024] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆44 Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆78 Updated last week
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆98 Updated 6 months ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆49 Updated 3 months ago
- ☆53 Updated 2 months ago
- Source code for Self-Evaluation Guided MCTS for online DPO. ☆101 Updated last month
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆35 Updated 8 months ago