patrick-tssn / LM-Research-Hub
Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)
☆18 Updated last month
Alternatives and similar repositories for LM-Research-Hub:
Users who are interested in LM-Research-Hub are comparing it to the repositories listed below
- ☆125 Updated 8 months ago
- Official Repo of LangSuitE ☆82 Updated 6 months ago
- Paper collection of methods that use language to interact with an environment, including the real world, simulated worlds, or the WWW… ☆126 Updated last year
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆102 Updated 11 months ago
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning". ☆149 Updated last year
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆51 Updated 9 months ago
- ☆44 Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning" ☆153 Updated 2 months ago
- Natural Language Reinforcement Learning ☆77 Updated 2 months ago
- GenRM-CoT: Data release for verification rationales ☆49 Updated 4 months ago
- Paper collection on the continuing line of work that starts from World Models. ☆168 Updated 8 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆127 Updated 4 months ago
- Repo of paper "Free Process Rewards without Process Labels" ☆132 Updated last week
- ☆23 Updated 9 months ago
- The source code of the paper "Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Pla… ☆85 Updated 7 months ago
- ☆80 Updated 8 months ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024) ☆49 Updated 4 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆130 Updated last month
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆161 Updated last month
- ☆43 Updated 4 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆70 Updated 6 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆34 Updated 10 months ago
- Pre-Trained Language Models for Interactive Decision-Making [NeurIPS 2022] ☆120 Updated 2 years ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆55 Updated 4 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆145 Updated 11 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆118 Updated 6 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆128 Updated 4 months ago
- ☆54 Updated 4 months ago