l294265421 / ChatGPT-Techniques-Introduction-for-Everyone
An introduction to the techniques behind ChatGPT
☆21 Updated last year
Alternatives and similar repositories for ChatGPT-Techniques-Introduction-for-Everyone:
Users who are interested in ChatGPT-Techniques-Introduction-for-Everyone are comparing it to the repositories listed below
- Implementation of the paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation" ☆13 Updated 4 months ago
- Minimal RLHF implementation built on top of minGPT. ☆29 Updated 7 months ago
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data) ☆26 Updated last year
- Dataset Reset Policy Optimization ☆30 Updated 10 months ago
- ☆30 Updated 5 months ago
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning ☆36 Updated 11 months ago
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement) ☆48 Updated last year
- PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms ☆19 Updated this week
- Domain-specific preference (DSP) data and customized RM fine-tuning. ☆24 Updated 11 months ago
- Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024… ☆33 Updated 3 months ago
- Offline RLHF codebase implementation for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human … ☆34 Updated 10 months ago
- Implementation of ICML 2023 paper: Future-conditioned Unsupervised Pretraining for Decision Transformer ☆27 Updated last year
- The Official Code for Offline Model-based Adaptable Policy Learning (NeurIPS'21 & TPAMI) ☆22 Updated last year
- Related papers for offline reinforcement learning (we mainly focus on representation and sequence modeling and conventional offline RL) ☆18 Updated 2 years ago
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆13 Updated last year
- code for paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement" ☆31 Updated last year
- ☆18 Updated last year
- Author's PyTorch implementation of ICML'23 paper "Policy Regularization with Dataset Constraint for Offline Reinforcement Learning" for D… ☆17 Updated 3 months ago
- ☆18 Updated 5 years ago
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers. ☆11 Updated last year
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ☆21 Updated last year
- Exploring techniques to generate diverse conventions in multi-agent settings ☆12 Updated last year
- ☆11 Updated 9 months ago
- [AutoML'22] Bayesian Generational Population-based Training (BG-PBT) ☆27 Updated 2 years ago
- Official implementation for "PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning" (NeurIPS 2024) ☆12 Updated 4 months ago
- ZYN: Zero-Shot Reward Models with Yes-No Questions ☆33 Updated last year
- OpenLLMDE: An open source data engineering framework for LLMs ☆17 Updated last year
- Author's PyTorch implementation of SR-DICE for marginalized importance sampling ☆15 Updated 3 years ago
- [EMNLP 2024 Findings] Unlocking Continual Learning Abilities in Language Models ☆23 Updated 4 months ago
- Implements the Messenger environment and EMMA model. ☆23 Updated last year