schinger / FullLLM
Full stack LLM (Pre-training/finetuning, PPO(RLHF), Inference, Quant, etc.)
☆30Updated 11 months ago
Alternatives and similar repositories for FullLLM
Users who are interested in FullLLM are comparing it to the repositories listed below
- Reinforcement Learning in LLM and NLP.☆62Updated last month
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆283Updated 11 months ago
- a-m-team's exploration in large language modeling☆195Updated 8 months ago
- ☆41Updated 10 months ago
- Train GRPO with zero dataset and low resources; 8-bit/4-bit/LoRA/QLoRA and multi-GPU supported ...☆79Updated 9 months ago
- Related work and background techniques for OpenAI o1☆221Updated last year
- Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat☆117Updated 2 years ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆153Updated 3 months ago
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆67Updated 11 months ago
- Custom reward development on verl☆144Updated 8 months ago
- ☆147Updated last year
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆94Updated 2 months ago
- ☆115Updated last year
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continual pre-training to improve …☆37Updated 8 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆80Updated 2 years ago
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆284Updated 2 years ago
- How to train an LLM tokenizer☆153Updated 2 years ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆284Updated 2 years ago
- Fantastic Data Engineering for Large Language Models☆93Updated last year
- ☆48Updated 11 months ago
- Collection of papers for scalable automated alignment.☆93Updated last year
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆414Updated 6 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆269Updated last year
- ☆332Updated 8 months ago
- ☆81Updated 2 months ago
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA☆237Updated 5 months ago
- ☆427Updated 3 months ago
- Train an LLM from scratch on a single 24 GB GPU☆56Updated 6 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆406Updated 4 months ago
- ☆163Updated 3 months ago