patrick-tssn / LM-Research-Hub
Language Modeling Research Hub: a comprehensive compendium for enthusiasts and scholars working on language models (LMs), with a particular focus on large language models (LLMs).
☆19 · Updated last month
Alternatives and similar repositories for LM-Research-Hub:
Users interested in LM-Research-Hub are comparing it to the libraries listed below.
- ☆128 · Updated 9 months ago
- Official repo of LangSuitE ☆83 · Updated 8 months ago
- A paper collection of methods that use language to interact with environments, including the real world, simulated worlds, or the WWW… ☆127 · Updated last year
- Code for the ACL 2024 paper "Adversarial Preference Optimization (APO)" ☆54 · Updated 11 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆102 · Updated last year
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning" ☆152 · Updated last year
- ☆24 · Updated last year
- [ICLR 2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials ☆30 · Updated 2 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆138 · Updated 6 months ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆23 · Updated last year
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models ☆17 · Updated last week
- Directional Preference Alignment ☆57 · Updated 7 months ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024) ☆51 · Updated 5 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆43 · Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 · Updated last year
- Natural Language Reinforcement Learning ☆87 · Updated 4 months ago
- Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆69 · Updated last year
- ☆24 · Updated 10 months ago
- Domain-specific preference (DSP) data and customized RM fine-tuning ☆25 · Updated last year
- Official repository for the paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74 · Updated 10 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆138 · Updated 2 months ago
- ☆44 · Updated 6 months ago
- My commonly used tools ☆53 · Updated 3 months ago
- Repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆49 · Updated 6 months ago
- A paper collection tracking the continuing line of work that starts from World Models ☆172 · Updated 10 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆120 · Updated 7 months ago
- Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models" ☆181 · Updated last year
- The code for Consistent In-Context Editing, an approach for tuning language models through contextual distributions, overcoming the limit… ☆27 · Updated last month
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆44 · Updated 3 weeks ago
- ☆91 · Updated 10 months ago