yaochenzhu / Rank-GRPO
(ICLR'26 + Netflix) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
☆35 · Updated 2 months ago
Alternatives and similar repositories for Rank-GRPO
Users who are interested in Rank-GRPO are comparing it to the repositories listed below.
- [NeurIPS 2024] The implementation of paper "On Softmax Direct Preference Optimization for Recommendation" ☆96 · Updated last year
- [TMLR 2025] A general framework for bridging LLMs and recommendation systems via reinforcement learning. https://arxiv.org/pdf/2503.24289 ☆125 · Updated 5 months ago
- ☆91 · Updated last week
- ☆14 · Updated 11 months ago
- Official Implementation of "Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning" at EMNLP 2024 Main Conf… ☆43 · Updated 6 months ago
- [ICLR 2025 Oral 🏆] The implementation of paper "Language Representations Can be What Recommenders Need: Findings and Potentials" ☆97 · Updated 8 months ago
- [RelKD'24] Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models ☆121 · Updated 9 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" ☆21 · Updated 5 months ago
- ☆17 · Updated 6 months ago
- (WWW'25 + Netflix) The first CRS that retrieves collaborative filtering knowledge with two-step context-aware reflection. ☆18 · Updated 4 months ago
- Discriminative Constrained Optimization for Reinforcing Large Reasoning Models ☆50 · Updated 2 months ago
- Recommender systems with large language models (Paper list) ☆63 · Updated 2 years ago
- This is the repo for the survey of Bias and Fairness in IR with LLMs. ☆59 · Updated 4 months ago
- 🔥🔥🔥 Latest Advances on Large Recommendation Models ☆116 · Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆56 · Updated 3 weeks ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆50 · Updated last year
- AnchorAttention: Improved attention for LLMs long-context training ☆213 · Updated last year
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization" ☆29 · Updated 3 weeks ago
- SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis ☆68 · Updated 6 months ago
- [NAACL 25 main] Awesome LLM Causal Reasoning is a collection of LLM-based causal reasoning works, including papers, code, and datasets. ☆113 · Updated 4 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆51 · Updated last year
- Code for AAAI'25 paper: LLM-Powered User Simulator for Recommender System ☆23 · Updated last year
- A Sober Look at Language Model Reasoning ☆92 · Updated 2 months ago
- The official repository of paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models" ☆110 · Updated 5 months ago
- Official implementation of ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆70 · Updated 9 months ago
- [WSDM 2024] Official PyTorch Implementation of Linear Recurrent Units for Sequential Recommendation (LRURec) ☆65 · Updated 11 months ago
- ☆204 · Updated last month
- ☆35 · Updated 7 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 · Updated last month
- Awesome things about generative recommendation models. ☆103 · Updated 9 months ago