davendw49 / llm_training_full_stack
Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆40 · Updated 11 months ago
Alternatives and similar repositories for llm_training_full_stack
Users that are interested in llm_training_full_stack are comparing it to the libraries listed below
- A comprehensive list of PAPERS, CODEBASES, and DATASETS on Decision Making using Foundation Models including LLMs and VLMs. ☆377 · Updated last year
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome. ☆488 · Updated last year
- A collection of LLM with RL papers ☆277 · Updated last year
- ☆13 · Updated last year
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆11 · Updated 7 months ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… ☆283 · Updated 10 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆37 · Updated last year
- A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models ☆49 · Updated 5 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆93 · Updated last year
- LLM-PySC2 is NKAI Decision Team and NUDT Decision Team's Python component of the StarCraft II LLM Decision Environment. It exposes Deepmi… ☆138 · Updated 5 months ago
- Direct preference optimization with f-divergences. ☆14 · Updated 10 months ago
- Implementation of TWOSOME ☆79 · Updated 8 months ago
- ☆84 · Updated 2 years ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆136 · Updated 2 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆257 · Updated last week
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆88 · Updated 4 months ago
- Awesome In-Context RL: A curated list of In-Context Reinforcement Learning ☆230 · Updated 2 weeks ago
- AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback (NAACL 2024) ☆18 · Updated last year
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆193 · Updated last year
- Must-read Papers on Large Language Model (LLM) as Optimizers and Automatic Optimization for Prompting LLMs. ☆248 · Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆190 · Updated 5 months ago
- ☆16 · Updated 11 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆324 · Updated last year
- This is the official implementation of paper "Leveraging Dual Process Theory in Language Agent Framework for Simultaneous Human-AI Collab… ☆41 · Updated 3 months ago
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents ☆25 · Updated last month
- A large-scale multi-modal pre-trained model ☆132 · Updated 2 years ago
- TextStarCraft2, a pure-language environment that lets LLMs play StarCraft II ☆288 · Updated 5 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆110 · Updated last year
- Must-read Papers on Large Language Model (LLM) Planning. ☆429 · Updated last year
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆280 · Updated last year