davendw49 / llm_training_full_stack
Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆39 · Updated 8 months ago
Alternatives and similar repositories for llm_training_full_stack
Users who are interested in llm_training_full_stack are comparing it to the repositories listed below.
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆34 · Updated last year
- A comprehensive list of PAPERS, CODEBASES, and DATASETS on Decision Making using Foundation Models including LLMs and VLMs. ☆371 · Updated last year
- A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models ☆42 · Updated 2 months ago
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆11 · Updated 4 months ago
- A collection of LLM with RL papers ☆272 · Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆92 · Updated last year
- Direct preference optimization with f-divergences. ☆13 · Updated 7 months ago
- ☆14 · Updated 8 months ago
- ☆79 · Updated last year
- MARFT stands for Multi-Agent Reinforcement Fine-Tuning. This repository implements an LLM-based multi-agent reinforcement fine-tuning fra… ☆43 · Updated last week
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆125 · Updated last week
- [TNNLS-2024, arXiv-2023.2.10] Official repository of "A Survey on Causal Reinforcement Learning" ☆32 · Updated 2 months ago
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome. ☆434 · Updated 9 months ago
- Implementation of TWOSOME ☆76 · Updated 5 months ago
- Awesome In-Context RL: A curated list of In-Context Reinforcement Learning ☆191 · Updated last week
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… ☆277 · Updated 7 months ago
- Natural Language Reinforcement Learning ☆89 · Updated 6 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆240 · Updated 3 weeks ago
- Tutorial4RL: Tutorial for Reinforcement Learning. An introductory reinforcement learning tutorial. ☆149 · Updated last year
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆105 · Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆179 · Updated 2 months ago
- A curated list of RL resources ☆40 · Updated last year
- ☆90 · Updated 2 years ago
- Improving Math reasoning through Direct Preference Optimization with Verifiable Pairs ☆13 · Updated 3 months ago
- Python code to implement LLM4Teach, a policy distillation approach for teaching reinforcement learning agents with a Large Language Model ☆43 · Updated last year
- Implementation of the Decision-Pretrained Transformer (DPT) from the paper Supervised Pretraining Can Learn In-Context Reinforcement Learni… ☆68 · Updated last year
- Official code for "Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning". ☆48 · Updated last year
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆185 · Updated last year
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆136 · Updated last week
- Tracking literature and additional online resources on transformers for sequential decision making including RL and beyond. ☆46 · Updated 2 years ago