l294265421 / ChatGPT-Techniques-Introduction-for-Everyone
An introduction to ChatGPT techniques
☆18 · Updated last year
Related projects:
- Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data" ☆23 · Updated 9 months ago
- Domain-specific preference (DSP) data and customized RM fine-tuning ☆24 · Updated 6 months ago
- Code for the ACL 2024 paper "Adversarial Preference Optimization (APO)" ☆49 · Updated 3 months ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" ☆22 · Updated 2 months ago
- Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement" ☆28 · Updated 8 months ago
- AI Alignment: A Comprehensive Survey ☆123 · Updated 10 months ago
- Benchmarking LLMs' gaming ability in multi-agent environments ☆33 · Updated this week
- [NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods ☆23 · Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆89 · Updated 2 months ago
- CS 294-112 @ UCB Deep RL ☆22 · Updated last year
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023) ☆20 · Updated 9 months ago
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models ☆33 · Updated 9 months ago
- Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023) ☆37 · Updated 2 months ago
- Feeling confused about superalignment? Here is a reading list ☆42 · Updated 8 months ago
- Training LLaMA or MOSS with RLHF, optionally with LoRA ☆20 · Updated last year
- Research code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆84 · Updated 5 months ago
- Information on NLP PhD applications worldwide ☆34 · Updated 3 weeks ago
- Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning" ☆26 · Updated 6 months ago
- Dataset Reset Policy Optimization ☆27 · Updated 5 months ago
- [ACL 2023 Findings] What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning ☆21 · Updated last year
- [ACL 2024] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆44 · Updated last month