davendw49 / llm_training_full_stack
Full Stack Practice of the Large Language Model Training @ RLChina 2024
★34 · Updated 3 months ago
Alternatives and similar repositories for llm_training_full_stack:
Users who are interested in llm_training_full_stack are comparing it to the repositories listed below.
- A comprehensive list of PAPERS, CODEBASES, and DATASETS on Decision Making using Foundation Models including LLMs and VLMs. ★348 · Updated 9 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ★91 · Updated 9 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ★33 · Updated 9 months ago
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome. ★273 · Updated 4 months ago
- A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models ★27 · Updated 2 months ago
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ★167 · Updated last year
- Direct preference optimization with f-divergences. ★13 · Updated 2 months ago
- ★36 · Updated 5 months ago
- ★53 · Updated 6 months ago
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted) ★157 · Updated last year
- ★69 · Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ★125 · Updated 10 months ago
- LLM-PySC2 is NKAI Decision Team and NUDT Decision Team's Python component of the StarCraft II LLM Decision Environment. It exposes Deepmi… ★102 · Updated 2 weeks ago
- Implementation of TWOSOME ★62 · Updated 3 weeks ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ★88 · Updated 11 months ago
- A collection of LLM with RL papers ★249 · Updated 9 months ago
- ★24 · Updated 10 months ago
- Use seaborn to draw RL picture ★25 · Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ★284 · Updated 5 months ago
- ★58 · Updated last year
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ★101 · Updated last year
- ★18 · Updated 5 months ago
- A large-scale multi-modal pre-trained model ★130 · Updated last year
- Natural Language Reinforcement Learning ★69 · Updated last month
- ★90 · Updated 2 years ago
- Implementation of the Decision-Pretrained Transformer (DPT) from the paper Supervised Pretraining Can Learn In-Context Reinforcement Learni… ★57 · Updated 8 months ago
- ProAgent: Building Proactive Cooperative Agents with Large Language Models ★69 · Updated 9 months ago
- Official code repository for Prompt-DT. ★102 · Updated 2 years ago
- ★106 · Updated last year
- ★12 · Updated 11 months ago