davendw49 / llm_training_full_stack
📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆39Updated 9 months ago
Alternatives and similar repositories for llm_training_full_stack
Users who are interested in llm_training_full_stack are comparing it to the repositories listed below
- A comprehensive list of PAPERS, CODEBASES, and DATASETS on Decision Making using Foundation Models including LLMs and VLMs.☆373Updated last year
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.☆449Updated 10 months ago
- A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models☆46Updated 3 months ago
- A collection of papers on LLMs with RL☆276Updated last year
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents☆34Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF)☆92Updated last year
- ☆90Updated 2 years ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett…☆279Updated 8 months ago
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24.☆11Updated 5 months ago
- Direct preference optimization with f-divergences.☆14Updated 8 months ago
- ☆14Updated 9 months ago
- LLM-PySC2 is NKAI Decision Team and NUDT Decision Team's Python component of the StarCraft II LLM Decision Environment. It exposes Deepmi…☆135Updated 2 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆129Updated last week
- Tutorial4RL: Tutorial for Reinforcement Learning. An introductory reinforcement learning tutorial.☆157Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆182Updated 3 months ago
- TextStarCraft2, a pure-language environment that supports LLMs playing StarCraft II☆282Updated 2 months ago
- ☆61Updated last year
- ☆66Updated last year
- Must-read Papers on Large Language Model (LLM) Planning.☆422Updated last year
- ☆40Updated 11 months ago
- ☆79Updated last year
- Implementation of TWOSOME☆77Updated 6 months ago
- ☆12Updated last year
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents☆25Updated 4 months ago
- A survey of Preference Reinforcement Learning☆9Updated 2 years ago
- A large-scale multi-modal pre-trained model☆132Updated 2 years ago
- Awesome RL Reasoning Recipes ("Triple R")☆747Updated last month
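- This project will host the blackboard notes, reference materials, and other content from my Bilibili video series "Theoretical Foundations of Reinforcement Learning".☆77Updated 2 years ago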
- Awesome In-Context RL: A curated list of In-Context Reinforcement Learning☆202Updated last month
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.☆318Updated 11 months ago