LAMDASZ-ML / ChinaTravel
ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning
☆14Updated this week
Alternatives and similar repositories for ChinaTravel
Users who are interested in ChinaTravel are comparing it to the repositories listed below
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"☆34Updated 2 months ago
- ☆45Updated 3 months ago
- Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"☆40Updated last year
- Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data"☆28Updated last year
- Implementation of ICML 2023 paper: Future-conditioned Unsupervised Pretraining for Decision Transformer☆27Updated last year
- ☆28Updated last week
- ☆29Updated last year
- Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆58Updated last month
- ✨✨Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models☆77Updated last month
- Exploring techniques to generate diverse conventions in multi-agent settings☆14Updated last year
- Rewarded soups official implementation☆57Updated last year
- Implementation of the paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"☆16Updated 7 months ago
- Offline RLHF codebase implementation for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human …☆37Updated last year
- ☆15Updated 6 months ago
- Accompanies the EMNLP 2024 paper: "Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions". This repo featur…☆19Updated 3 months ago
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt…☆29Updated 9 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Updated last month
- Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents☆34Updated last week
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"☆29Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆76Updated 8 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆45Updated this week
- Neuro-Symbolic Hierarchical Rule Induction☆13Updated 2 years ago
- ☆50Updated 8 months ago
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Updated 9 months ago
- ☆14Updated last year
- This repository is the official implementation of the TRAC optimizer in Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement …☆25Updated 2 weeks ago
- ☆54Updated 6 months ago
- ☆21Updated 6 months ago
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning☆9Updated 3 months ago
- A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.☆43Updated 4 months ago