davendw49 / llm_training_full_stack
📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆40Updated last year
Alternatives and similar repositories for llm_training_full_stack
Users that are interested in llm_training_full_stack are comparing it to the libraries listed below
- A comprehensive list of PAPERS, CODEBASES, and DATASETS on Decision Making using Foundation Models including LLMs and VLMs.☆381Updated last year
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.☆508Updated 2 weeks ago
- A collection of papers on LLMs with RL☆278Updated last year
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆139Updated 3 weeks ago
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24.☆15Updated 8 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents☆38Updated last year
- Direct preference optimization with f-divergences.☆14Updated last year
- Papers related to Direct Preference Optimization (DPO)☆19Updated last year
- ☆13Updated last year
- A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models☆49Updated 7 months ago
- [ICML 2025] "From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium"☆28Updated 4 months ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett…☆290Updated last year
- LLM-PySC2 is NKAI Decision Team and NUDT Decision Team's Python component of the StarCraft II LLM Decision Environment. It exposes Deepmi…☆141Updated 6 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF)☆92Updated last year
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments☆88Updated 6 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆197Updated 7 months ago
- Must-read Papers on Large Language Model (LLM) as Optimizers and Automatic Optimization for Prompting LLMs.☆251Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.☆327Updated last year
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆341Updated last week
- ☆16Updated last year
- Implementation of TWOSOME☆82Updated 10 months ago
- Reinforced Multi-LLM Agents training☆58Updated 5 months ago
- Official implementation of the NeurIPS 2024 paper CORY☆22Updated 8 months ago
- [EMNLP 2024 Main] Official implementation of the paper "Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mech…☆16Updated last year
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)☆194Updated last year
- ☆179Updated 10 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆364Updated last month
- AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback (NAACL 2024)☆18Updated last year
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆381Updated 4 months ago
- Awesome In-Context RL: A curated list of In-Context Reinforcement Learning☆251Updated 2 months ago